[Ometa] On OMeta's Syntax

Alessandro Warth alexwarth at gmail.com
Fri Jul 4 20:18:30 PDT 2008


Hello everybody,

In light of the messages I've seen over the past week or so, I thought it
would be a good idea for me to say a few words about OMeta's syntax. Ok,
maybe not just a few words :)

*Pointy-bracket vs. non-pointy-bracket syntax*

I've experimented with two variants. First, there was the one with pointy
brackets around rule applications, which was used in the DLS paper and in
OMeta/Squeak. By the time I implemented OMeta/JS, I had used the
pointy-bracket syntax enough to know that it was a little too "heavy", so I
made a few changes which included getting rid of the pointy brackets. Here
are a couple of examples to illustrate some of these changes:

*OMeta/Squeak:*

   - addExpr ::= <addExpr>:x <token '+'> <mulExpr>:y => [{#+. x. y}]
             | <mulExpr>
   - listOf :rule :sep ::= <apply rule>:x (<token sep> <apply rule>)*:xs =>
   [Array with: x withAll: xs]

*OMeta/JS:*

   - addExpr = addExpr:x "+" mulExpr:y -> ["+", x, y]
             | mulExpr
   - listOf :rule :sep = apply(rule):x (token(sep) apply(rule))*:xs ->
   [x].concat(xs)

You can probably work out exactly what the changes are from looking at these
examples, but I'll go over the main ones anyway:

   1. *old rule definition "operator"*: ::=
   *new syntax*: =
   2. *old rule application*: <foo>
   *new syntax*: foo
   3. *old parameterized rule application*: <foo bar>
   *new syntax*: foo(bar)
   4. *old semantic action*: => [*squeak code*]
   *new syntax*: -> *JavaScript primary expression*

There is also a new syntactic sugar "xyz", which means the same thing as
token("xyz"). You can see it being used in addExpr above.  This sugar has
been great for making grammars more readable, but unfortunately not
everybody knows about it (this is 100% my fault) so it has also been a
source of confusion.

*Note*: the token rule is not special in any way, so you're free to define
it to do whatever you want. By default, it skips any number of spaces and
then tries to match the sequence of characters that was passed to it as an
argument.
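
As a rough illustration of that default behavior, here is a minimal JavaScript sketch. The function name matchToken and its position-based return value are assumptions made for this example only; this is not the code OMeta/JS actually uses, and it treats any whitespace character as a "space".

```javascript
// Hypothetical helper, NOT OMeta/JS's real implementation.
// Skips leading whitespace starting at `pos`, then tries to match
// `expected` literally. Returns the position just past the match,
// or -1 if the match fails.
function matchToken(input, pos, expected) {
  // Skip any number of whitespace characters.
  while (pos < input.length && /\s/.test(input[pos])) {
    pos++;
  }
  // Try to match the exact sequence of characters passed as the argument.
  if (input.startsWith(expected, pos)) {
    return pos + expected.length;
  }
  return -1;
}
```

Under these assumptions, matchToken("   + 1", 0, "+") skips three spaces and matches the "+", which is roughly what "+" (i.e., token("+")) does in a grammar.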

*A bit of commentary (or: Hindsight is 20/20)*

I much prefer the non-pointy-bracket syntax of OMeta/JS, but I must admit
that I made a mistake when it comes to the syntax of semantic actions.

In OMeta/Squeak, the Squeak code inside a semantic action was always
delimited by square brackets. In OMeta/JS, I thought it would be much nicer
to not use delimiters altogether; I did this by restricting the kind of
expression that can be used in a semantic action to what JavaScript calls a
"primary expression".

In theory, this change was perfectly fine. In practice, however, this was
not a good idea because it makes it very easy for programmers to make nasty
mistakes... Here's an example:

addExpr = addExpr:x "+" mulExpr:y -> x + y
        | mulExpr

The rule above *looks* right, but it isn't because the programmer forgot
about the "primary expression" thing. Here's what addExpr actually *means*:

addExpr = addExpr:x "+" mulExpr:y (-> x)+ y
        | mulExpr

In other words, only the x is part of the semantic action; OMeta/JS thinks
the + is a *Kleene-+* (a.k.a. the one-or-more operator) that is being
applied to the semantic action (which, in this case, makes no sense), and
that the y is a rule application!

Sure, the programmer should have written

addExpr = addExpr:x "+" mulExpr:y -> (x + y)
        | mulExpr

(with parentheses around the x + y)... but the fact of the matter is that
the new "delimiterless" semantic action syntax makes this kind of mistake
very easy to make, which is a very bad thing.
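
To make the pitfall concrete, here is a small JavaScript sketch of how a "primary expression only" reader carves up the text that follows the arrow. The function splitAction is hypothetical and written just for this illustration (it is not taken from the OMeta/JS parser), and it only understands identifiers, numbers, and bracketed groups whose string literals contain no brackets.

```javascript
// Hypothetical sketch, NOT the OMeta/JS parser.
// Given the text after "->", return the part that would be read as the
// semantic action (one primary expression) and whatever is left over.
function splitAction(src) {
  let i = 0;
  while (i < src.length && src[i] === ' ') i++; // skip leading spaces
  const start = i;
  const closers = { '(': ')', '[': ']', '{': '}' };
  if (src[i] in closers) {
    // Bracketed primary expression: scan to the matching close bracket.
    const stack = [];
    for (; i < src.length; i++) {
      const c = src[i];
      if (c in closers) stack.push(closers[c]);
      else if (c === stack[stack.length - 1]) stack.pop();
      if (stack.length === 0) { i++; break; }
    }
  } else {
    // Identifier or number: stop at the first non-word character.
    while (i < src.length && /[A-Za-z0-9_$]/.test(src[i])) i++;
  }
  return { action: src.slice(start, i), rest: src.slice(i).trim() };
}
```

With this sketch, splitAction(" x + y") gives the action "x" and leaves "+ y" behind to be read as grammar, whereas splitAction(" (x + y)") takes the whole parenthesized expression as the action and leaves nothing behind.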

In the next version of OMeta/JS, I will fix this problem by requiring curly
brackets around semantic actions.

(BTW, if you're implementing OMeta/Some-S-Expression-Based-Language (e.g.,
LISP or Scheme), you don't have to worry about this problem. After all, it's
always obvious where an s-expression begins and ends, so there is no need
for delimiters.)

*Some advice for people who are porting OMeta to different platforms*

... and this brings me to another point I wanted to make.

The idea of making OMeta's syntax more uniform across different platforms
has come up recently. I think this could be very helpful, but I would hate
to see people taking it too seriously.

I've often described OMeta as a "parasitic language", meaning that it
doesn't exist on its own, i.e., it needs to live on top of a "host
language". One of the nice things about having an OMeta/X---where X is some
host language---is that it enables transformations on X's data structures to
be expressed pretty naturally.

Different host languages have different kinds of data structures, each with
their own syntax. Take arrays, for example. Here's one in JavaScript, [1, 2,
3], and the same one in Smalltalk, {1. 2. 3}. If we were to "standardize"
the syntax of OMeta's listy patterns, which are used to match array objects,
the standard syntax couldn't possibly look natural for all host languages.
So rather than trying to come up with some standard syntax for OMeta's
patterns, I think whoever is porting OMeta to language X is better off
letting the syntax of X influence the syntax of OMeta/X's patterns. Also,
there is a good chance that you'll want OMeta/X to include a few new kinds
of patterns for pattern-matching against some X-specific data structures.

*Character and string patterns*

This is another issue that has come up recently, and I think it makes for a
good illustration of my previous point.

OMeta/Squeak has patterns for characters (e.g., $x) as well as strings
(e.g., 'hello'). The syntax of these patterns is exactly what you'd expect
if you're a Smalltalker.

OMeta/JS also has string patterns (e.g., 'hello'), but it doesn't have
character patterns. This is not because I decided that character patterns
are bad, but rather because there is no such thing as a character object in
JavaScript. If you ask a string for its first element, you get a string of
length 1 (e.g., 'hello'[0] == 'h').
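
This is easy to check in plain JavaScript. The snippet below is only an illustration (the variable names are made up for the example): indexing a string yields another string, so there is nothing for a separate character pattern to match against.

```javascript
// Indexing a string yields a string of length 1, not a character object.
const first = 'hello'[0];
const info = {
  value: first,          // the one-character string
  type: typeof first,    // its JavaScript type
  length: first.length   // how many characters it holds
};
```

Here info.type is 'string' and info.length is 1, so a string pattern of length 1 already covers the character case.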

Now, for OMeta#, it probably makes sense for character patterns to look like
'x' and for string patterns to look like "abc", since that's how you write
character and string literals in C#.


I think that's all for now... Happy 4th of July!

Cheers,
Alex