Hi,

First, I want to say that MRQL really appeals to me; I can see its benefits
over Hive, Impala, etc.

Recently I have been working on an open source project called Zeppelin (
http://zeppelin-project.org). Over the last few days, I tried to write an MRQL driver
for Zeppelin.

The result is at https://github.com/nflabs/zeppelin-driver-mrql. It's still
experimental, but it's working well.
I attached a screenshot so you can see how it works.

[image: Inline image 1]

While implementing the MRQL driver for Zeppelin, I found a few tasks that could
improve MRQL.

*1. Source code structure*

a. Currently the whole source code lives in the 'src' directory under the project root.
I think it's more common to keep the source code under each submodule,
e.g.
/core/src/main/java
/gen/src/main/java
/spark/src/main/java
....

b. It's also quite common to put source code under a directory named after its package.
E.g. if the file is mrql.java and its package is org.apache.mrql, then place it under
/core/src/main/java/org/apache/mrql/mrql.java

*2. Unit testing*

MRQL does not provide any unit tests. I think that will slow down the development
process and make things hard to change and verify.
MRQL itself looks pretty unit-test friendly (for example, it
supports a 'memory' mode to evaluate queries), so it should not be difficult to add
some unit tests.
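
To illustrate the kind of test I mean, here is a rough sketch. Note that
QueryEngine and InMemoryEngine are names I made up for this example; they are
not MRQL classes, and the real test would call whatever MRQL exposes for its
'memory' mode. The point is only that an in-memory evaluator can be checked
with plain assertions and no cluster.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: QueryEngine and InMemoryEngine are stand-ins, not MRQL classes.
public class MemoryModeTestSketch {
    /** Assumed minimal interface for something that evaluates a query. */
    interface QueryEngine {
        List<Integer> select(List<Integer> input, int greaterThan);
    }

    /** Evaluates entirely in memory, so a test needs no external services. */
    static final class InMemoryEngine implements QueryEngine {
        @Override
        public List<Integer> select(List<Integer> input, int greaterThan) {
            return input.stream()
                        .filter(x -> x > greaterThan)
                        .collect(Collectors.toList());
        }
    }

    /** A unit test in miniature: fixed input, expected output, fail loudly. */
    static void testSelectFiltersRows() {
        QueryEngine engine = new InMemoryEngine();
        List<Integer> result = engine.select(Arrays.asList(1, 5, 10), 4);
        if (!result.equals(Arrays.asList(5, 10))) {
            throw new AssertionError("unexpected result: " + result);
        }
    }

    public static void main(String[] args) {
        testSelectFiltersRows();
        System.out.println("ok");
    }
}
```

In a real build, this would probably live under a test directory and use a test
framework rather than a main method.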

*3. Static everywhere*

I haven't studied the source code deeply, but I can see a lot of static
variables. They're everywhere, and they cause two problems:

a) The source code is difficult to understand.
b) It cannot run in parallel (not thread-safe).

Currently MRQL is run from the command line, and in that case thread safety is
not a big problem. But for people who want to embed MRQL, it will be
a problem.
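
As a rough sketch of the direction I have in mind, per-query state could move
from static fields into an instance that each caller creates. QueryContext and
its methods are hypothetical names for illustration only, not existing MRQL
classes, and the "evaluation" here is a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names are MRQL classes.
public class EvaluatorSketch {
    /** Holds per-query state in instance fields instead of static fields. */
    static final class QueryContext {
        private final String mode;
        private int statementsEvaluated = 0;

        QueryContext(String mode) {
            this.mode = mode;
        }

        /** Placeholder evaluation: tags the trimmed query with the mode. */
        String evaluate(String query) {
            statementsEvaluated++;
            return mode + ":" + query.trim();
        }

        int statementsEvaluated() {
            return statementsEvaluated;
        }
    }

    /** Runs n queries on a thread pool; each task owns its own context. */
    static List<String> runConcurrently(int n) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int id = i;
                futures.add(pool.submit(() -> {
                    // No shared mutable state, so no locking is needed here.
                    QueryContext ctx = new QueryContext("memory");
                    return ctx.evaluate(" query" + id + " ");
                }));
            }
            // Collect results in submission order.
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runConcurrently(3));
    }
}
```

With this shape, an embedding application such as a Zeppelin driver could
create one context per session or per query, and two sessions would not
overwrite each other's settings.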


It would be great to get some feedback on the tasks I listed.
After the discussion, I hope I can spend some time on improving MRQL.

Thanks.
--------
Best,
moon
