Thanks, a couple more things below...

On Thu, Oct 2, 2014 at 11:50 AM, Julian Hyde <[email protected]> wrote:
> Glad you found the Mongo adapter. It’s definitely closer to what you want.
>
> Questions such as [1], and also Andrew Selden’s experience working on an
> Elasticsearch adapter [4] have made me think that an interpreter [5] might
> be useful, so you can execute queries without converting expressions to
> java strings and back again. There is a partial implementation already.
> Would an interpreter be useful to you?
>

Honestly, I am not sure. I'm still new at all this, so I don't know whether
it's needed, but these will most likely be short-lived queries, so an
interpreter sounds like a good fit.

>
> On Oct 2, 2014, at 10:17 AM, Dan Di Spaltro <[email protected]>
> wrote:
>
> > For instance in rocksdb
> > everything besides the primary key is a table scan [2]. And it works
> > like a cursor, you just iterate over the values. Ideally during that
> > iteration you could apply the simple filtering.
>
> By the way, HBase works in a similar way. It is an ambition of mine (and
> James Taylor’s) to find a way to bring Calcite and Phoenix together
> somehow.
>

Yeah, I am very familiar with HBase. There are two basic differences: it's a
two-level hash (row -> cf/cq -> value), and you can push some filtering down
without a coprocessor using fuzzyfilter etc. Another interesting point is
that you can use RDB as an in-memory materialized view with the WALs stored
in HDFS, so the actual ops are served from memory and you can do some neat
stuff.

>
> > Like I mentioned above this is where I am getting tripped up, since
> > it's such a basic datastore, I am having a hard time grokking how to
> > express that.
> >
> > I was thinking of using janino to compile to a java expression and
> > passing that to the iteration engine, but that is going to take some
> > time.
>
> What is the Java API to RocksDB? I found [6] and RocksDB [7] and
> RocksIterator [8].
>

Yeah, the code below looks good.
>
> One way to think about this is to choose a reasonably challenging query,
> implement it by hand (post the java code to this list) and then we’ll
> figure out how to generate that code (or generate calls to a helper class
> that has the same effect).
>
> If for example the query is “select … from emp where id between 10 and
> 20”, my guess is that you’d write
>
> RocksDB db = …;
> RocksIterator iter = db.newIterator();
> byte[] start = toBytes(10);
> byte[] end = toBytes(20);
> iter.seek(start);
> while (iter.isValid()) {
>   byte[] k = iter.key();
>   if (compare(k, end) > 0) {
>     break;
>   }
>   byte[] v = iter.value();
>   // emit (k, v) somehow
>   iter.next();
> }
>
> Then you need to package that as an Enumerable.
> Then generalize it into a scan that can take start value, end value of
> various types.
>

Interesting, so are you suggesting that I could create different enumerables
depending on the operations that are invoked? For instance, if you have:

  select id, name from emp where id between 10 and 20 and name = "bill"

you would want to push down the id filter (which would potentially translate
to a seek), and for name you'd want to filter during iteration. For an IN
clause, you'd also want to sort the literal set and then seek. Anyway, once I
get the basics working, layering this on should make more sense.

I am still kind of missing how I pass things down to the physical layer when
they aren't queries; a more full-featured example would help. Anyway, I'll
keep hacking at it.

>
> >> Create a RocksConvention, a RocksRel interface, and some rules:
> >>
> >> RocksProjectRule: ProjectRel on a RocksRel ==> RocksProjectRel
> >> RocksFilterRule: FilterRel on RocksRel ==> RocksFilterRel
> >
> > As an example, that's what this is conveying, right [3]?
>
> Yes.
>
> >> ArrayTable would be useful if you want to cache data sets in memory. As
> >> always with caching, I’d suggest you skip it in version 1.
> >
> > I wasn't sure if I could subclass it and use the interesting bits
> > since rdb deals with arrays of bytes, but since serialization isn't
> > what I am confused on I'll skip this question.
>
> Yeah, ArrayTable needs things to be in its own particular format. Not
> appropriate for what you want.
>
> Julian
>
> [1]
> http://mail-archives.apache.org/mod_mbox/incubator-optiq-dev/201409.mbox/%3CCANQjSRNDKkRgqW839-0zpjhHW_hExWxEXA%2B8mCxO8-a2nRX1oA%40mail.gmail.com%3E
> [2] https://github.com/facebook/rocksdb/wiki/Basic-Operations#iteration
> [3]
> https://github.com/apache/incubator-optiq/blob/90f0bead8923dfb28992b60baee8d8cb92c18d9e/mongodb/src/main/java/net/hydromatic/optiq/impl/mongodb/MongoRules.java#L218
> [4]
> https://github.com/aleph-zero/incubator-optiq/tree/elasticsearch-optiq-0.9.0-incubating
> [5] https://issues.apache.org/jira/browse/OPTIQ-416
> [6] https://github.com/facebook/rocksdb/wiki/RocksJava-Basics
> [7]
> https://github.com/facebook/rocksdb/blob/master/java/org/rocksdb/RocksDB.java
> [8]
> https://github.com/facebook/rocksdb/blob/master/java/org/rocksdb/RocksIterator.java
>

--
Dan Di Spaltro
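
P.S. To make the id/name example above concrete, here is roughly what I have
in mind, as a sketch only and not working adapter code. A TreeMap stands in
for RocksDB (the real thing would seek a RocksIterator over byte[] keys), a
plain Iterable stands in for the Enumerable, and RangeScanSketch, scan and
demo are names I made up for illustration. The id range drives the seek and
the stop condition; the name check runs as a residual filter during
iteration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Hypothetical sketch: key-range scan plus residual filter, exposed as an Iterable. */
final class RangeScanSketch {
  private RangeScanSketch() {}

  /**
   * Returns rows whose key lies in [start, end] and whose value passes
   * {@code residual}. {@code store} is a sorted in-memory stand-in for the
   * RocksDB cursor; it is an assumption of this sketch, not the real API.
   */
  static Iterable<Map.Entry<Integer, String>> scan(
      NavigableMap<Integer, String> store,
      int start,
      int end,
      Predicate<String> residual) {
    return () -> new Iterator<Map.Entry<Integer, String>>() {
      // "Seek": position the cursor at the first key >= start.
      private final Iterator<Map.Entry<Integer, String>> cursor =
          store.tailMap(start, true).entrySet().iterator();
      // Next row to hand out, or null once the scan is exhausted.
      private Map.Entry<Integer, String> pending = advance();

      private Map.Entry<Integer, String> advance() {
        while (cursor.hasNext()) {
          Map.Entry<Integer, String> row = cursor.next();
          if (row.getKey() > end) {
            return null; // past the end key: stop scanning
          }
          if (residual.test(row.getValue())) {
            return row; // passes the filter applied during iteration
          }
        }
        return null;
      }

      @Override
      public boolean hasNext() {
        return pending != null;
      }

      @Override
      public Map.Entry<Integer, String> next() {
        if (pending == null) {
          throw new NoSuchElementException();
        }
        Map.Entry<Integer, String> row = pending;
        pending = advance();
        return row;
      }
    };
  }

  /** Runs "id between 10 and 20 and name = 'bill'" over made-up rows. */
  static List<Integer> demo() {
    NavigableMap<Integer, String> emp = new TreeMap<>();
    emp.put(5, "bill");
    emp.put(10, "bill");
    emp.put(15, "ann");
    emp.put(20, "bill");
    emp.put(25, "bill");

    List<Integer> ids = new ArrayList<>();
    for (Map.Entry<Integer, String> row : scan(emp, 10, 20, "bill"::equals)) {
      ids.add(row.getKey());
    }
    return ids;
  }
}
```

With the five made-up rows in demo(), the scan starts at id 10, skips id 15
because the name does not match, and stops when it reaches id 25, so only ids
10 and 20 come back. What I still need to figure out is how the planner would
hand the range and the residual predicate to something like this.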
