Actually we are re-using JsonReader for parsing. So we also verified the
same with "dfs" storage, it seems issue exist with JsonReader.

Kamesh raised this issue(DRILL-1460
<https://issues.apache.org/jira/browse/DRILL-1460>) for the same. Once it
is fixed in JsonReader, it should also work with mongo storage plugin.

Thanks & Regards,
B Anil Kumar.

On Sat, Sep 27, 2014 at 1:37 PM, Kamesh <[email protected]> wrote:

> Thanks Jinfeng & Neeraja for looking into this.
> We will look into the above mentioned issues.
>
>
>
> On Sat, Sep 27, 2014 at 8:28 AM, Neeraja Rentachintala <
> [email protected]> wrote:
>
>> I have played with the plugin as well today and overall its very good.
>>
>> I tried the queries
>> http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/
>> on the zip code dataset and all the aggregate queries worked.
>>
>>
>> -----------
>>
>> select sum(pop) from zipcodes where city='SEATTLE’;
>>
>> select state, city, sum(pop) from zipcodes group by state,city order by
>> sum(pop) asc limit 1;
>>
>> select state,city,avg(pop) from zipcodes group by state, city;
>>
>> select city, sum(pop) from zipcodes group by city order by sum(pop) asc
>> limit 1;
>>
>> select state,sum(pop) from zipcodes group by state having sum(pop) >
>> 10000000;
>>
>>
>> ----------
>>
>>
>> I however noticed issues with querying repeating elements (used USDA
>> nutrition dataset), especially more than one level nested as well as JOINs
>> (example queries are below)
>>
>> ------------------
>>
>> 0: jdbc:drill:zk=local> SELECT t1.first_name FROM
>> mongo.employee.`empinfo` t1 JOIN  mongo.employee.`empinfo` t2 ON
>> t1.`employee_id` = t2.`employee_id`;
>>
>> Query failed: Failure while setting up Foreman. Internal error: Error
>> while applying rule DrillPushProjIntoScan, args
>> [rel#12606:ProjectRel.NONE.ANY([]).[](child=rel#12598:Subset#0.ENUMERABLE.ANY([]).[],employee_id=$1,first_name=$2),
>> rel#12594:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[mongo,
>> employee, empinfo])] [08f4eedd-f5c9-4ebf-8d5b-d9249b79ca32]
>>
>>
>> 0: jdbc:drill:zk=local> select t.nutrients from mongo.usda.nutrition t
>> limit 1;
>>
>> Query failed: Screen received stop request sent. You tried to write a
>> BigInt type when you are using a ValueWriter of type
>> NullableFloat8WriterImpl. [dc44e277-1b1d-4f00-b60e-9f06b883e7c5]
>>
>>
>> Error: exception while executing query: Failure while trying to get next
>> result batch. (state=,code=0)
>>
>> 0: jdbc:drill:zk=local> select t.nutrients[0].units from
>> mongo.usda.nutrition t limit 1;
>>
>> Query failed: Screen received stop request sent. You tried to write a
>> BigInt type when you are using a ValueWriter of type
>> NullableFloat8WriterImpl. [a285c85e-4607-48fc-97af-41b5726459e2]
>>
>>
>> Error: exception while executing query: Failure while trying to get next
>> result batch. (state=,code=0)
>>
>>
>>
>> On Fri, Sep 26, 2014 at 6:07 PM, Jinfeng Ni <[email protected]> wrote:
>>
>>>
>>> -----------------------------------------------------------
>>> This is an automatically generated e-mail. To reply, visit:
>>> https://reviews.apache.org/r/25996/#review54756
>>> -----------------------------------------------------------
>>>
>>> Ship it!
>>>
>>>
>>> I did not do a detail code review; let that task to Steven. I mainly
>>> played with this Mongo plugin. Overall it looks good.
>>>
>>> Basically, I start a mongodb instance, import the data, and run several
>>> single table queryies, and all of them work perfectly.
>>>
>>> Some issues I saw when playing around :
>>>
>>> 1. The result of select * seems not the expect answer : it would return
>>> a map containing all the columns:
>>>
>>> SELECT * FROM mongo.employee.`empinfo` limit 2;
>>> +------------+
>>> |     *      |
>>> +------------+
>>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , "first_name" :
>>> "Steve" , "last_name" : "Eurich" , "position_id" : 16 , "position" : "Store
>>> T" , "isFTE" : true} |
>>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , "first_name" :
>>> "Mary" , "last_name" : "Pierson" , "position_id" : 16 , "position" : "Store
>>> T" , "isFTE" : true} |
>>> +------------+
>>> 2 rows selected (0.084 seconds)
>>>
>>> In contrast, here is the result when Drill queries a .json file:
>>>
>>> select * from cp.`employee.json` limit 2;
>>>
>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+
>>> | employee_id | full_name  | first_name | last_name  | position_id |
>>> position_title |  store_id  | department_id | birth_date | hire_date  |
>>>  salary   | supervisor_id | education_level | marital_status |   gender   |
>>> management_role |
>>>
>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+
>>> | 1           | Sheri Nowmer | Sheri      | Nowmer     | 1           |
>>> President      | 0          | 1             | 1961-08-26 | 1994-12-01
>>> 00:00:00.0 | 80000.0    | 0             | Graduate Degree | S
>>> | F          | Senior Management |
>>> | 2           | Derrick Whelply | Derrick    | Whelply    | 2
>>>  | VP Country Manager | 0          | 1             | 1915-07-03 |
>>> 1994-12-01 00:00:00.0 | 40000.0    | 1             | Graduate Degree | M
>>>           | M          | Senior Management |
>>>
>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+
>>> 2 rows selected (0.39 seconds)
>>>
>>>
>>> 2. Join two mongodb tables would fail.
>>>
>>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1,
>>> mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` limit
>>> 1;
>>> Query failed: Failure while setting up Foreman. Internal error: while
>>> converting `t1`.`employee_id` = `t2`.`employee_id`
>>> [39eb6c88-fd21-4514-8903-48d99210b88d]
>>>
>>> 3. Join a mongodb table with a table with other storage engine would
>>> fail with CanNotPlanException:
>>>
>>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1,
>>> mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` limit
>>> 1;
>>> Query failed: Failure while setting up Foreman. Internal error: while
>>> converting `t1`.`employee_id` = `t2`.`employee_id`
>>> [39eb6c88-fd21-4514-8903-48d99210b88d]
>>>
>>> Error: exception while executing query: Failure while trying to get next
>>> result batch. (state=,code=0)
>>> 0: jdbc:drill:zk=local> SELECT t1.first_name, t1.last_name FROM
>>> mongo.employee.`empinfo` as t1, cp.`employee.json` t2 where t1.employee_id
>>> = t2.employee_id limit 10;
>>> Query failed: Failure while parsing sql. Node
>>> [rel#2496:Subset#5.LOGICAL.ANY([]).[]] could not be implemented; planner
>>> state:
>>>
>>> Root: rel#2496:Subset#5.LOGICAL.ANY([]).[]
>>> Original rel:
>>> ......
>>>
>>> 4. Select *, regular_column from mongodb would return the regular_column
>>> as null.
>>>
>>> 0: jdbc:drill:zk=local> SELECT first_name FROM mongo.employee.`empinfo`
>>> limit 2;
>>> +------------+
>>> | first_name |
>>> +------------+
>>> | Steve      |
>>> | Mary       |
>>> +------------+
>>> 2 rows selected (0.084 seconds)
>>> 0: jdbc:drill:zk=local> SELECT *, first_name FROM
>>> mongo.employee.`empinfo` limit 2;
>>> +------------+------------+
>>> |     *      | first_name |
>>> +------------+------------+
>>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , "first_name" :
>>> "Steve" , "last_name" : "Eurich" , "position_id" : 16 , "position" : "Store
>>> T" , "isFTE" : true} | null       |
>>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , "first_name" :
>>> "Mary" , "last_name" : "Pierson" , "position_id" : 16 , "position" : "Store
>>> T" , "isFTE" : true} | null       |
>>> +------------+------------+
>>>
>>>
>>>
>>> I think it would be fine to fix those issues in the next release.
>>>
>>>
>>> PS: could you please re-build a patch after rebasing on the recent
>>> master branch?
>>>
>>> - Jinfeng Ni
>>>
>>>
>>> On Sept. 24, 2014, 11:06 a.m., Anil Kumar B wrote:
>>> >
>>> > -----------------------------------------------------------
>>> > This is an automatically generated e-mail. To reply, visit:
>>> > https://reviews.apache.org/r/25996/
>>> > -----------------------------------------------------------
>>> >
>>> > (Updated Sept. 24, 2014, 11:06 a.m.)
>>> >
>>> >
>>> > Review request for drill, Aditya Kishore, Jacques Nadeau, and Kamesh B.
>>> >
>>> >
>>> > Repository: drill-git
>>> >
>>> >
>>> > Description
>>> > -------
>>> >
>>> > Mongo storage plugin support: The features which we implemented as
>>> part of this is as follows.
>>> > 1) Support for sharded(chunk wise), shared-replicated(chunk wise),
>>> replicated, stand-alone
>>> > 2) Predicate pushdown
>>> > 3) Mongo PStore
>>> >
>>> > MongoRecordReader uses JsonReaderWithState in the case of non-star
>>> queries.
>>> >
>>> >
>>> > Diffs
>>> > -----
>>> >
>>> >   contrib/pom.xml 728038a
>>> >   contrib/storage-mongo/pom.xml PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/DrillMongoConstants.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCnxnManager.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCompareFunctionProcessor.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoFilterBuilder.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoGroupScan.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoPushDownFilterForScan.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanBatchCreator.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanSpec.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePlugin.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePluginConfig.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoSubScan.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoUtils.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/ChunkInfo.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/MongoCompareOp.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStore.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStoreProvider.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoDatabaseSchema.java
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java
>>> PRE-CREATION
>>> >
>>>  contrib/storage-mongo/src/main/resources/bootstrap-storage-plugins.json
>>> PRE-CREATION
>>> >   contrib/storage-mongo/src/main/resources/drill-module.conf
>>> PRE-CREATION
>>> >
>>>  
>>> contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoChunkAssignment.java
>>> PRE-CREATION
>>> >   distribution/pom.xml cd5df0d
>>> >   distribution/src/assemble/bin.xml 86e3802
>>> >
>>>  exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
>>> 933bfbe
>>> >
>>>  
>>> exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
>>> 4fa61e1
>>> >
>>>  
>>> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java
>>> 4e12b8b
>>> >
>>>  
>>> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReaderWithState.java
>>> ef995f8
>>> >
>>> > Diff: https://reviews.apache.org/r/25996/diff/
>>> >
>>> >
>>> > Testing
>>> > -------
>>> >
>>> > 1) Tested various set of queries on sharded, replicated and
>>> stand-alone modes.
>>> >
>>> > 2) Test Environment details: We created mongo cluster with 2 shards
>>> with a collections consists of 35 chunks(18 chunks are one shard and
>>> remaining chunks on on other shard). Below are the few queries which we
>>> tested in all the environments.
>>> >
>>> >     a) SELECT * FROM mongo.employee.`empinfo` limit 10;
>>> >
>>> >       b) SELECT first_name, last_name FROM mongo.employee.`empinfo`
>>> limit 10;
>>> >
>>> >       c) SELECT first_name, last_name FROM mongo.employee.`empinfo`
>>> where employee_id = 1111;
>>> >
>>> >       d) SELECT * FROM mongo.employee.`empinfo` where full_name =
>>> 'Phil Munoz';
>>> >
>>> >       e) SELECT first_name, last_name, position_id FROM
>>> mongo.employee.`empinfo` where employee_id = 1111  OR position_id = 16;
>>> >
>>> >       g) SELECT first_name, last_name FROM mongo.employee.`empinfo`
>>> where isFTE = true;
>>> >
>>> >       h) SELECT first_name, last_name, position_id FROM
>>> mongo.employee.`empinfo` where employee_id = 1107  AND position_id = 17 AND
>>> last_name = 'Yonce';
>>> >
>>> >
>>> > 3) PStore functionality not fully tested.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Anil Kumar B
>>> >
>>> >
>>>
>>>
>>
>
>
> --
> Kamesh.
>

Reply via email to