Anil That totally explains it. Thanks for checking. The queries are working now. Given that this is expected behavior for all_text_mode, the JIRA I filed can be resolved.
-thanks Neeraja On Tue, Sep 30, 2014 at 9:38 PM, AnilKumar B <[email protected]> wrote: > Hi Neeraja, > > It seems you are enabled "store.mongo.all_text_mode = true", that's why > aggregations are not working. This seems to be expected and we are also > discussing same on https://issues.apache.org/jira/browse/DRILL-1460 > > The same, I updated on https://issues.apache.org/jira/browse/DRILL-1475. > Please let us know your comments. > > > Thanks & Regards, > B Anil Kumar. > > On Wed, Oct 1, 2014 at 5:56 AM, AnilKumar B <[email protected]> wrote: > >> Hi Neeraja, >> >> Since from last 1 week, there is no code changes. We will look into this >> issue. If possible, can you please raise jira for this? >> >> As of now we are working on following issues. >> 1) Joins on (mongo vs mongo) and (mongo vs other storages) >> 2) select *, (some other fields) (Jinfeng's review comment) >> >> Next 10 days, we both are on vacation, so once we come back, we will >> resolve these issues. >> >> >> Thanks & Regards, >> B Anil Kumar. >> >> On Wed, Oct 1, 2014 at 5:03 AM, Neeraja Rentachintala < >> [email protected]> wrote: >> >>> Btw I got the dataset JSON from >>> http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/ >>> >>> On Tue, Sep 30, 2014 at 4:30 PM, Neeraja Rentachintala < >>> [email protected]> wrote: >>> >>>> Hi Anil, Kamesh >>>> >>>> I was trying MongoDB plugin with the following queries and all of them >>>> succeeded. This is using the zip codes dataset. >>>> However intermittently (now consistently) , these are failing. I built >>>> Kamesh branch and also tried from the latest 0.6 master (which merged the >>>> plugin). >>>> Do you have any insights into this. I can file JIRA if you like. >>>> >>>> select state,sum(pop) from zipcodes group by state having sum(pop) > >>>> 10000000 order by sum(pop) desc; >>>> >>>> select state,city,avg(pop) from zipcodes group by state, city; >>>> >>>> select city, sum(pop) from zipcodes where state is not null group by >>>> city order by sum(pop) desc limit 1; >>>> >>>> select city, sum(pop) from zipcodes where state is not null group by >>>> city order by sum(pop) asc limit 1; >>>> >>>> --------------- >>>> >>>> >>>> Errors >>>> >>>> ------------------- >>>> >>>> select sum(pop) from zipcodes where city=‘CHICAGO’; >>>> >>>> 0: jdbc:drill:zk=local> select state, city, sum(pop) from zipcodes >>>> group by state, city order by sum(pop) limit 5; >>>> >>>> Query failed: Failure while running fragment. Invalid value for >>>> boolean: 15338 [1366240a-125d-4646-962a-1734f84b03b3] >>>> >>>> >>>> Error: exception while executing query: Failure while trying to get >>>> next result batch. (state=,code=0) >>>> >>>> 0: jdbc:drill:zk=local> select state, city, sum(pop) from zipcodes >>>> where state is not null and city is not null group by state, city order by >>>> sum(pop) limit 5; >>>> >>>> Query failed: Failure while running fragment. Invalid value for >>>> boolean: 15338 [53a1c241-82f0-4413-aa09-39f851d7209a] >>>> >>>> >>>> Error: exception while executing query: Failure while trying to get >>>> next result batch. (state=,code=0) >>>> >>>> 0: jdbc:drill:zk=local> select state, city, sum(pop) from zipcodes >>>> group by state,city order by sum(pop) asc limit 1; >>>> >>>> Query failed: Failure while running fragment. Invalid value for >>>> boolean: 15338 [911f48f7-019e-42f2-b0af-1e255416ce76] >>>> >>>> >>>> Error: exception while executing query: Failure while trying to get >>>> next result batch. (state=,code=0) >>>> >>>> 0: jdbc:drill:zk=local> select state,city,avg(pop) from zipcodes group >>>> by state, city; >>>> >>>> Query failed: Failure while running fragment. Invalid value for >>>> boolean: 15338 [54c5e480-972a-49b9-860e-60d4a7429366] >>>> >>>> >>>> Error: exception while executing query: Failure while trying to get >>>> next result batch. (state=,code=0) >>>> >>>> 0: jdbc:drill:zk=local> select city, sum(pop) from zipcodes group by >>>> city order by sum(pop) asc limit 1; >>>> >>>> Query failed: Failure while running fragment. Invalid value for >>>> boolean: 15338 [b5374f4a-3ee6-47ac-9dc1-fcebed555518] >>>> >>>> >>>> Error: exception while executing query: Failure while trying to get >>>> next result batch. (state=,code=0) >>>> >>>> 0: jdbc:drill:zk=local> select state,sum(pop) from zipcodes group by >>>> state having sum(pop) > 10000000; >>>> >>>> Query failed: Failure while running fragment. Invalid value for >>>> boolean: 15338 [5875169d-dcb8-419f-991e-d38d41dddcbb] >>>> >>>> >>>> Error: exception while executing query: Failure while trying to get >>>> next result batch. (state=,code=0) >>>> >>>> On Sat, Sep 27, 2014 at 1:07 AM, Kamesh <[email protected]> >>>> wrote: >>>> >>>>> Thanks Jinfeng & Neeraja for looking into this. >>>>> We will look into the above mentioned issues. >>>>> >>>>> >>>>> >>>>> On Sat, Sep 27, 2014 at 8:28 AM, Neeraja Rentachintala < >>>>> [email protected]> wrote: >>>>> >>>>>> I have played with the plugin as well today and overall its very good. >>>>>> >>>>>> I tried the queries >>>>>> http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/ >>>>>> on the zip code dataset and all the aggregate queries worked. >>>>>> >>>>>> >>>>>> ----------- >>>>>> >>>>>> select sum(pop) from zipcodes where city='SEATTLE’; >>>>>> >>>>>> select state, city, sum(pop) from zipcodes group by state,city order >>>>>> by sum(pop) asc limit 1; >>>>>> >>>>>> select state,city,avg(pop) from zipcodes group by state, city; >>>>>> >>>>>> select city, sum(pop) from zipcodes group by city order by sum(pop) >>>>>> asc limit 1; >>>>>> >>>>>> select state,sum(pop) from zipcodes group by state having sum(pop) > >>>>>> 10000000; >>>>>> >>>>>> >>>>>> ---------- >>>>>> >>>>>> >>>>>> I however noticed issues with querying repeating elements (used USDA >>>>>> nutrition dataset), especially more than one level nested as well as >>>>>> JOINs >>>>>> (example queries are below) >>>>>> >>>>>> ------------------ >>>>>> >>>>>> 0: jdbc:drill:zk=local> SELECT t1.first_name FROM >>>>>> mongo.employee.`empinfo` t1 JOIN mongo.employee.`empinfo` t2 ON >>>>>> t1.`employee_id` = t2.`employee_id`; >>>>>> >>>>>> Query failed: Failure while setting up Foreman. Internal error: Error >>>>>> while applying rule DrillPushProjIntoScan, args >>>>>> [rel#12606:ProjectRel.NONE.ANY([]).[](child=rel#12598:Subset#0.ENUMERABLE.ANY([]).[],employee_id=$1,first_name=$2), >>>>>> rel#12594:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[mongo, >>>>>> employee, empinfo])] [08f4eedd-f5c9-4ebf-8d5b-d9249b79ca32] >>>>>> >>>>>> >>>>>> 0: jdbc:drill:zk=local> select t.nutrients from mongo.usda.nutrition >>>>>> t limit 1; >>>>>> >>>>>> Query failed: Screen received stop request sent. You tried to write a >>>>>> BigInt type when you are using a ValueWriter of type >>>>>> NullableFloat8WriterImpl. [dc44e277-1b1d-4f00-b60e-9f06b883e7c5] >>>>>> >>>>>> >>>>>> Error: exception while executing query: Failure while trying to get >>>>>> next result batch. (state=,code=0) >>>>>> >>>>>> 0: jdbc:drill:zk=local> select t.nutrients[0].units from >>>>>> mongo.usda.nutrition t limit 1; >>>>>> >>>>>> Query failed: Screen received stop request sent. You tried to write a >>>>>> BigInt type when you are using a ValueWriter of type >>>>>> NullableFloat8WriterImpl. [a285c85e-4607-48fc-97af-41b5726459e2] >>>>>> >>>>>> >>>>>> Error: exception while executing query: Failure while trying to get >>>>>> next result batch. (state=,code=0) >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Sep 26, 2014 at 6:07 PM, Jinfeng Ni <[email protected]> wrote: >>>>>> >>>>>>> >>>>>>> ----------------------------------------------------------- >>>>>>> This is an automatically generated e-mail. To reply, visit: >>>>>>> https://reviews.apache.org/r/25996/#review54756 >>>>>>> ----------------------------------------------------------- >>>>>>> >>>>>>> Ship it! >>>>>>> >>>>>>> >>>>>>> I did not do a detail code review; let that task to Steven. I mainly >>>>>>> played with this Mongo plugin. Overall it looks good. >>>>>>> >>>>>>> Basically, I start a mongodb instance, import the data, and run >>>>>>> several single table queryies, and all of them work perfectly. >>>>>>> >>>>>>> Some issues I saw when playing around : >>>>>>> >>>>>>> 1. The result of select * seems not the expect answer : it would >>>>>>> return a map containing all the columns: >>>>>>> >>>>>>> SELECT * FROM mongo.employee.`empinfo` limit 2; >>>>>>> +------------+ >>>>>>> | * | >>>>>>> +------------+ >>>>>>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , >>>>>>> "first_name" : "Steve" , "last_name" : "Eurich" , "position_id" : 16 , >>>>>>> "position" : "Store T" , "isFTE" : true} | >>>>>>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , >>>>>>> "first_name" : "Mary" , "last_name" : "Pierson" , "position_id" : 16 , >>>>>>> "position" : "Store T" , "isFTE" : true} | >>>>>>> +------------+ >>>>>>> 2 rows selected (0.084 seconds) >>>>>>> >>>>>>> In contrast, here is the result when Drill queries a .json file: >>>>>>> >>>>>>> select * from cp.`employee.json` limit 2; >>>>>>> >>>>>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+ >>>>>>> | employee_id | full_name | first_name | last_name | position_id | >>>>>>> position_title | store_id | department_id | birth_date | hire_date | >>>>>>> salary | supervisor_id | education_level | marital_status | gender >>>>>>> | >>>>>>> management_role | >>>>>>> >>>>>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+ >>>>>>> | 1 | Sheri Nowmer | Sheri | Nowmer | 1 >>>>>>> | President | 0 | 1 | 1961-08-26 | 1994-12-01 >>>>>>> 00:00:00.0 | 80000.0 | 0 | Graduate Degree | S >>>>>>> | F | Senior Management | >>>>>>> | 2 | Derrick Whelply | Derrick | Whelply | 2 >>>>>>> | VP Country Manager | 0 | 1 | 1915-07-03 | >>>>>>> 1994-12-01 00:00:00.0 | 40000.0 | 1 | Graduate Degree | M >>>>>>> | M | Senior Management | >>>>>>> >>>>>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+ >>>>>>> 2 rows selected (0.39 seconds) >>>>>>> >>>>>>> >>>>>>> 2. Join two mongodb tables would fail. >>>>>>> >>>>>>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1, >>>>>>> mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` >>>>>>> limit >>>>>>> 1; >>>>>>> Query failed: Failure while setting up Foreman. Internal error: >>>>>>> while converting `t1`.`employee_id` = `t2`.`employee_id` >>>>>>> [39eb6c88-fd21-4514-8903-48d99210b88d] >>>>>>> >>>>>>> 3. Join a mongodb table with a table with other storage engine would >>>>>>> fail with CanNotPlanException: >>>>>>> >>>>>>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1, >>>>>>> mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` >>>>>>> limit >>>>>>> 1; >>>>>>> Query failed: Failure while setting up Foreman. Internal error: >>>>>>> while converting `t1`.`employee_id` = `t2`.`employee_id` >>>>>>> [39eb6c88-fd21-4514-8903-48d99210b88d] >>>>>>> >>>>>>> Error: exception while executing query: Failure while trying to get >>>>>>> next result batch. (state=,code=0) >>>>>>> 0: jdbc:drill:zk=local> SELECT t1.first_name, t1.last_name FROM >>>>>>> mongo.employee.`empinfo` as t1, cp.`employee.json` t2 where >>>>>>> t1.employee_id >>>>>>> = t2.employee_id limit 10; >>>>>>> Query failed: Failure while parsing sql. Node >>>>>>> [rel#2496:Subset#5.LOGICAL.ANY([]).[]] could not be implemented; planner >>>>>>> state: >>>>>>> >>>>>>> Root: rel#2496:Subset#5.LOGICAL.ANY([]).[] >>>>>>> Original rel: >>>>>>> ...... >>>>>>> >>>>>>> 4. Select *, regular_column from mongodb would return the >>>>>>> regular_column as null. >>>>>>> >>>>>>> 0: jdbc:drill:zk=local> SELECT first_name FROM >>>>>>> mongo.employee.`empinfo` limit 2; >>>>>>> +------------+ >>>>>>> | first_name | >>>>>>> +------------+ >>>>>>> | Steve | >>>>>>> | Mary | >>>>>>> +------------+ >>>>>>> 2 rows selected (0.084 seconds) >>>>>>> 0: jdbc:drill:zk=local> SELECT *, first_name FROM >>>>>>> mongo.employee.`empinfo` limit 2; >>>>>>> +------------+------------+ >>>>>>> | * | first_name | >>>>>>> +------------+------------+ >>>>>>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , >>>>>>> "first_name" : "Steve" , "last_name" : "Eurich" , "position_id" : 16 , >>>>>>> "position" : "Store T" , "isFTE" : true} | null | >>>>>>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , >>>>>>> "first_name" : "Mary" , "last_name" : "Pierson" , "position_id" : 16 , >>>>>>> "position" : "Store T" , "isFTE" : true} | null | >>>>>>> +------------+------------+ >>>>>>> >>>>>>> >>>>>>> >>>>>>> I think it would be fine to fix those issues in the next release. >>>>>>> >>>>>>> >>>>>>> PS: could you please re-build a patch after rebasing on the recent >>>>>>> master branch? >>>>>>> >>>>>>> - Jinfeng Ni >>>>>>> >>>>>>> >>>>>>> On Sept. 24, 2014, 11:06 a.m., Anil Kumar B wrote: >>>>>>> > >>>>>>> > ----------------------------------------------------------- >>>>>>> > This is an automatically generated e-mail. To reply, visit: >>>>>>> > https://reviews.apache.org/r/25996/ >>>>>>> > ----------------------------------------------------------- >>>>>>> > >>>>>>> > (Updated Sept. 24, 2014, 11:06 a.m.) >>>>>>> > >>>>>>> > >>>>>>> > Review request for drill, Aditya Kishore, Jacques Nadeau, and >>>>>>> Kamesh B. >>>>>>> > >>>>>>> > >>>>>>> > Repository: drill-git >>>>>>> > >>>>>>> > >>>>>>> > Description >>>>>>> > ------- >>>>>>> > >>>>>>> > Mongo storage plugin support: The features which we implemented as >>>>>>> part of this is as follows. >>>>>>> > 1) Support for sharded(chunk wise), shared-replicated(chunk wise), >>>>>>> replicated, stand-alone >>>>>>> > 2) Predicate pushdown >>>>>>> > 3) Mongo PStore >>>>>>> > >>>>>>> > MongoRecordReader uses JsonReaderWithState in the case of non-star >>>>>>> queries. >>>>>>> > >>>>>>> > >>>>>>> > Diffs >>>>>>> > ----- >>>>>>> > >>>>>>> > contrib/pom.xml 728038a >>>>>>> > contrib/storage-mongo/pom.xml PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/DrillMongoConstants.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCnxnManager.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCompareFunctionProcessor.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoFilterBuilder.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoGroupScan.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoPushDownFilterForScan.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanBatchCreator.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanSpec.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePlugin.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePluginConfig.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoSubScan.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoUtils.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/ChunkInfo.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/MongoCompareOp.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStore.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStoreProvider.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoDatabaseSchema.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> contrib/storage-mongo/src/main/resources/bootstrap-storage-plugins.json >>>>>>> PRE-CREATION >>>>>>> > contrib/storage-mongo/src/main/resources/drill-module.conf >>>>>>> PRE-CREATION >>>>>>> > >>>>>>> >>>>>>> contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoChunkAssignment.java >>>>>>> PRE-CREATION >>>>>>> > distribution/pom.xml cd5df0d >>>>>>> > distribution/src/assemble/bin.xml 86e3802 >>>>>>> > >>>>>>> exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java >>>>>>> 933bfbe >>>>>>> > >>>>>>> >>>>>>> exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java >>>>>>> 4fa61e1 >>>>>>> > >>>>>>> >>>>>>> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java >>>>>>> 4e12b8b >>>>>>> > >>>>>>> >>>>>>> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReaderWithState.java >>>>>>> ef995f8 >>>>>>> > >>>>>>> > Diff: https://reviews.apache.org/r/25996/diff/ >>>>>>> > >>>>>>> > >>>>>>> > Testing >>>>>>> > ------- >>>>>>> > >>>>>>> > 1) Tested various set of queries on sharded, replicated and >>>>>>> stand-alone modes. >>>>>>> > >>>>>>> > 2) Test Environment details: We created mongo cluster with 2 >>>>>>> shards with a collections consists of 35 chunks(18 chunks are one shard >>>>>>> and >>>>>>> remaining chunks on on other shard). Below are the few queries which we >>>>>>> tested in all the environments. >>>>>>> > >>>>>>> > a) SELECT * FROM mongo.employee.`empinfo` limit 10; >>>>>>> > >>>>>>> > b) SELECT first_name, last_name FROM >>>>>>> mongo.employee.`empinfo` limit 10; >>>>>>> > >>>>>>> > c) SELECT first_name, last_name FROM >>>>>>> mongo.employee.`empinfo` where employee_id = 1111; >>>>>>> > >>>>>>> > d) SELECT * FROM mongo.employee.`empinfo` where full_name = >>>>>>> 'Phil Munoz'; >>>>>>> > >>>>>>> > e) SELECT first_name, last_name, position_id FROM >>>>>>> mongo.employee.`empinfo` where employee_id = 1111 OR position_id = 16; >>>>>>> > >>>>>>> > g) SELECT first_name, last_name FROM >>>>>>> mongo.employee.`empinfo` where isFTE = true; >>>>>>> > >>>>>>> > h) SELECT first_name, last_name, position_id FROM >>>>>>> mongo.employee.`empinfo` where employee_id = 1107 AND position_id = 17 >>>>>>> AND >>>>>>> last_name = 'Yonce'; >>>>>>> > >>>>>>> > >>>>>>> > 3) PStore functionality not fully tested. >>>>>>> > >>>>>>> > >>>>>>> > Thanks, >>>>>>> > >>>>>>> > Anil Kumar B >>>>>>> > >>>>>>> > >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Kamesh. >>>>> >>>> >>>> >>> >> >
