We are actually re-using JsonReader for parsing, so we also verified the same queries with "dfs" storage. The issue seems to exist in JsonReader itself.
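As an illustration only: the error text quoted below (a BigInt written into a NullableFloat8 writer) reads like a field that holds decimals in some records and integers in others. That is an interpretation of the message, not something confirmed in this thread. The Python sketch below is not Drill code; the helper names and the sample records are hypothetical. It lists the fields in line-delimited JSON that mix the two numeric kinds, which may help when checking a file under "dfs" before querying it.

```python
import json
from collections import defaultdict


def numeric_kinds(value, path="", found=None):
    """Record, per field path, which numeric kinds (int/float) occur."""
    if found is None:
        found = defaultdict(set)
    if isinstance(value, dict):
        for key, child in value.items():
            numeric_kinds(child, f"{path}.{key}" if path else key, found)
    elif isinstance(value, list):
        for child in value:
            numeric_kinds(child, path + "[]", found)
    elif isinstance(value, bool):
        pass  # bool is a subclass of int in Python; it is not a number here
    elif isinstance(value, (int, float)):
        found[path].add(type(value).__name__)
    return found


def mixed_numeric_fields(json_lines):
    """Return the sorted field paths that hold both ints and floats."""
    found = defaultdict(set)
    for line in json_lines:
        if line.strip():
            numeric_kinds(json.loads(line), "", found)
    return sorted(path for path, kinds in found.items() if len(kinds) > 1)


# Made-up records shaped loosely like a nutrition document (assumption).
sample = [
    '{"nutrients": [{"units": "g", "value": 25.18}, {"units": "g", "value": 29}]}',
    '{"nutrients": [{"units": "kcal", "value": 376}]}',
]
print(mixed_numeric_fields(sample))  # ['nutrients[].value']
```

A field reported here is only a candidate for the failure; whether Drill actually rejects it depends on the JsonReader behaviour tracked in the JIRA issue.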
Kamesh raised DRILL-1460 <https://issues.apache.org/jira/browse/DRILL-1460>
for this issue. Once it is fixed in JsonReader, it should also work with the
mongo storage plugin.

Thanks & Regards,
B Anil Kumar.

On Sat, Sep 27, 2014 at 1:37 PM, Kamesh <[email protected]> wrote:

> Thanks Jinfeng & Neeraja for looking into this.
> We will look into the above mentioned issues.
>
> On Sat, Sep 27, 2014 at 8:28 AM, Neeraja Rentachintala <
> [email protected]> wrote:
>
>> I have played with the plugin as well today and overall it's very good.
>>
>> I tried the queries
>> http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/
>> on the zip code dataset and all the aggregate queries worked.
>>
>> -----------
>>
>> select sum(pop) from zipcodes where city='SEATTLE';
>>
>> select state, city, sum(pop) from zipcodes group by state,city order by
>> sum(pop) asc limit 1;
>>
>> select state,city,avg(pop) from zipcodes group by state, city;
>>
>> select city, sum(pop) from zipcodes group by city order by sum(pop) asc
>> limit 1;
>>
>> select state,sum(pop) from zipcodes group by state having sum(pop) >
>> 10000000;
>>
>> ----------
>>
>> I however noticed issues with querying repeating elements (used USDA
>> nutrition dataset), especially more than one level nested as well as JOINs
>> (example queries are below)
>>
>> ------------------
>>
>> 0: jdbc:drill:zk=local> SELECT t1.first_name FROM
>> mongo.employee.`empinfo` t1 JOIN mongo.employee.`empinfo` t2 ON
>> t1.`employee_id` = t2.`employee_id`;
>>
>> Query failed: Failure while setting up Foreman.
>> Internal error: Error
>> while applying rule DrillPushProjIntoScan, args
>> [rel#12606:ProjectRel.NONE.ANY([]).[](child=rel#12598:Subset#0.ENUMERABLE.ANY([]).[],employee_id=$1,first_name=$2),
>> rel#12594:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[mongo,
>> employee, empinfo])] [08f4eedd-f5c9-4ebf-8d5b-d9249b79ca32]
>>
>> 0: jdbc:drill:zk=local> select t.nutrients from mongo.usda.nutrition t
>> limit 1;
>>
>> Query failed: Screen received stop request sent. You tried to write a
>> BigInt type when you are using a ValueWriter of type
>> NullableFloat8WriterImpl. [dc44e277-1b1d-4f00-b60e-9f06b883e7c5]
>>
>> Error: exception while executing query: Failure while trying to get next
>> result batch. (state=,code=0)
>>
>> 0: jdbc:drill:zk=local> select t.nutrients[0].units from
>> mongo.usda.nutrition t limit 1;
>>
>> Query failed: Screen received stop request sent. You tried to write a
>> BigInt type when you are using a ValueWriter of type
>> NullableFloat8WriterImpl. [a285c85e-4607-48fc-97af-41b5726459e2]
>>
>> Error: exception while executing query: Failure while trying to get next
>> result batch. (state=,code=0)
>>
>> On Fri, Sep 26, 2014 at 6:07 PM, Jinfeng Ni <[email protected]> wrote:
>>
>>> -----------------------------------------------------------
>>> This is an automatically generated e-mail. To reply, visit:
>>> https://reviews.apache.org/r/25996/#review54756
>>> -----------------------------------------------------------
>>>
>>> Ship it!
>>>
>>> I did not do a detailed code review; I leave that task to Steven. I mainly
>>> played with this Mongo plugin. Overall it looks good.
>>>
>>> Basically, I started a mongodb instance, imported the data, and ran several
>>> single-table queries, and all of them worked perfectly.
>>>
>>> Some issues I saw when playing around:
>>>
>>> 1. The result of select * does not seem to be the expected answer: it
>>> returns a single map containing all the columns:
>>>
>>> SELECT * FROM mongo.employee.`empinfo` limit 2;
>>> +------------+
>>> | * |
>>> +------------+
>>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , "first_name" : "Steve" , "last_name" : "Eurich" , "position_id" : 16 , "position" : "Store T" , "isFTE" : true} |
>>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , "first_name" : "Mary" , "last_name" : "Pierson" , "position_id" : 16 , "position" : "Store T" , "isFTE" : true} |
>>> +------------+
>>> 2 rows selected (0.084 seconds)
>>>
>>> In contrast, here is the result when Drill queries a .json file:
>>>
>>> select * from cp.`employee.json` limit 2;
>>>
>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+
>>> | employee_id | full_name | first_name | last_name | position_id | position_title | store_id | department_id | birth_date | hire_date | salary | supervisor_id | education_level | marital_status | gender | management_role |
>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+
>>> | 1 | Sheri Nowmer | Sheri | Nowmer | 1 | President | 0 | 1 | 1961-08-26 | 1994-12-01 00:00:00.0 | 80000.0 | 0 | Graduate Degree | S | F | Senior Management |
>>> | 2 | Derrick Whelply | Derrick | Whelply | 2 | VP Country Manager | 0 | 1 | 1915-07-03 | 1994-12-01 00:00:00.0 | 40000.0 | 1 | Graduate Degree | M | M | Senior Management |
>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+
>>> 2 rows selected (0.39 seconds)
>>>
>>> 2. Joining two mongodb tables fails.
>>>
>>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1,
>>> mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` limit
>>> 1;
>>> Query failed: Failure while setting up Foreman. Internal error: while
>>> converting `t1`.`employee_id` = `t2`.`employee_id`
>>> [39eb6c88-fd21-4514-8903-48d99210b88d]
>>>
>>> 3. Joining a mongodb table with a table from another storage engine
>>> fails with CanNotPlanException:
>>>
>>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1,
>>> mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` limit
>>> 1;
>>> Query failed: Failure while setting up Foreman. Internal error: while
>>> converting `t1`.`employee_id` = `t2`.`employee_id`
>>> [39eb6c88-fd21-4514-8903-48d99210b88d]
>>>
>>> Error: exception while executing query: Failure while trying to get next
>>> result batch. (state=,code=0)
>>> 0: jdbc:drill:zk=local> SELECT t1.first_name, t1.last_name FROM
>>> mongo.employee.`empinfo` as t1, cp.`employee.json` t2 where t1.employee_id
>>> = t2.employee_id limit 10;
>>> Query failed: Failure while parsing sql. Node
>>> [rel#2496:Subset#5.LOGICAL.ANY([]).[]] could not be implemented; planner
>>> state:
>>>
>>> Root: rel#2496:Subset#5.LOGICAL.ANY([]).[]
>>> Original rel:
>>> ......
>>>
>>> 4. Select *, regular_column from mongodb returns the regular_column
>>> as null.
>>>
>>> 0: jdbc:drill:zk=local> SELECT first_name FROM mongo.employee.`empinfo`
>>> limit 2;
>>> +------------+
>>> | first_name |
>>> +------------+
>>> | Steve |
>>> | Mary |
>>> +------------+
>>> 2 rows selected (0.084 seconds)
>>> 0: jdbc:drill:zk=local> SELECT *, first_name FROM
>>> mongo.employee.`empinfo` limit 2;
>>> +------------+------------+
>>> | * | first_name |
>>> +------------+------------+
>>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , "first_name" : "Steve" , "last_name" : "Eurich" , "position_id" : 16 , "position" : "Store T" , "isFTE" : true} | null |
>>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , "first_name" : "Mary" , "last_name" : "Pierson" , "position_id" : 16 , "position" : "Store T" , "isFTE" : true} | null |
>>> +------------+------------+
>>>
>>> I think it would be fine to fix those issues in the next release.
>>>
>>> PS: could you please re-build a patch after rebasing on the recent
>>> master branch?
>>>
>>> - Jinfeng Ni
>>>
>>>
>>> On Sept. 24, 2014, 11:06 a.m., Anil Kumar B wrote:
>>> >
>>> > -----------------------------------------------------------
>>> > This is an automatically generated e-mail. To reply, visit:
>>> > https://reviews.apache.org/r/25996/
>>> > -----------------------------------------------------------
>>> >
>>> > (Updated Sept. 24, 2014, 11:06 a.m.)
>>> >
>>> >
>>> > Review request for drill, Aditya Kishore, Jacques Nadeau, and Kamesh B.
>>> >
>>> >
>>> > Repository: drill-git
>>> >
>>> >
>>> > Description
>>> > -------
>>> >
>>> > Mongo storage plugin support: The features which we implemented as part of this are as follows.
>>> > 1) Support for sharded (chunk wise), sharded-replicated (chunk wise), replicated, stand-alone
>>> > 2) Predicate pushdown
>>> > 3) Mongo PStore
>>> >
>>> > MongoRecordReader uses JsonReaderWithState in the case of non-star queries.
>>> >
>>> >
>>> > Diffs
>>> > -----
>>> >
>>> > contrib/pom.xml 728038a
>>> > contrib/storage-mongo/pom.xml PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/DrillMongoConstants.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCnxnManager.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCompareFunctionProcessor.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoFilterBuilder.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoGroupScan.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoPushDownFilterForScan.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanBatchCreator.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanSpec.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePlugin.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePluginConfig.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoSubScan.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoUtils.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/ChunkInfo.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/MongoCompareOp.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStore.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStoreProvider.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoDatabaseSchema.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java PRE-CREATION
>>> > contrib/storage-mongo/src/main/resources/bootstrap-storage-plugins.json PRE-CREATION
>>> > contrib/storage-mongo/src/main/resources/drill-module.conf PRE-CREATION
>>> > contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoChunkAssignment.java PRE-CREATION
>>> > distribution/pom.xml cd5df0d
>>> > distribution/src/assemble/bin.xml 86e3802
>>> > exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 933bfbe
>>> > exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 4fa61e1
>>> > exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java 4e12b8b
>>> > exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReaderWithState.java ef995f8
>>> >
>>> > Diff: https://reviews.apache.org/r/25996/diff/
>>> >
>>> >
>>> > Testing
>>> > -------
>>> >
>>> > 1) Tested various sets of queries in sharded, replicated and stand-alone modes.
>>> >
>>> > 2) Test environment details: We created a mongo cluster with 2 shards and a collection consisting of 35 chunks (18 chunks on one shard and the remaining chunks on the other shard). Below are a few of the queries we tested in all the environments.
>>> >
>>> > a) SELECT * FROM mongo.employee.`empinfo` limit 10;
>>> >
>>> > b) SELECT first_name, last_name FROM mongo.employee.`empinfo` limit 10;
>>> >
>>> > c) SELECT first_name, last_name FROM mongo.employee.`empinfo` where employee_id = 1111;
>>> >
>>> > d) SELECT * FROM mongo.employee.`empinfo` where full_name = 'Phil Munoz';
>>> >
>>> > e) SELECT first_name, last_name, position_id FROM mongo.employee.`empinfo` where employee_id = 1111 OR position_id = 16;
>>> >
>>> > g) SELECT first_name, last_name FROM mongo.employee.`empinfo` where isFTE = true;
>>> >
>>> > h) SELECT first_name, last_name, position_id FROM mongo.employee.`empinfo` where employee_id = 1107 AND position_id = 17 AND last_name = 'Yonce';
>>> >
>>> >
>>> > 3) PStore functionality not fully tested.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Anil Kumar B
>>> >
>>>
>>
>
> --
> Kamesh.
