[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070323#comment-14070323 ]

Brandon Williams commented on CASSANDRA-6454:

I'm +1 on switching to M3 in principle, but it seems highly unlikely that we're going to catch any problems here with the handful of records we use in the pig tests.

bq. I'm -0 on adding extra yaml files, fwiw.

I agree, this doesn't seem like the time or place to be worried about BOP vs M3. Let's just take that out and use the default yaml for now.

> Pig support for hadoop CqlInputFormat
>
> Key: CASSANDRA-6454
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6454
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Reporter: Alex Liu
> Assignee: Alex Liu
> Fix For: 2.0.10
>
> Attachments: 6454-2.0-branch.txt, 6454-v2-2.0-branch.txt, 6454-v3-2.0-branch.txt, 6454-v3-2.1-branch.txt
>
> CASSANDRA-6311 adds new CqlInputFormat, we need add the Pig support for it

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069908#comment-14069908 ]

Jonathan Ellis commented on CASSANDRA-6454:

I'm -0 on adding extra yaml files, fwiw.
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069620#comment-14069620 ]

Alex Liu commented on CASSANDRA-6454:

Most Cassandra deployments use Murmur3Partitioner, so testing on Murmur3Partitioner covers the general case. Some unit tests still use the old ByteOrderedPartitioner, so simply updating cassandra.yaml to Murmur3Partitioner breaks other unit tests. That's the reason I created a new yaml file.
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069459#comment-14069459 ]

Brandon Williams commented on CASSANDRA-6454:

Is there a reason we need to switch to it for this? (not that I disagree with it at all)
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069450#comment-14069450 ]

Alex Liu commented on CASSANDRA-6454:

cassandra_pig.yaml uses Murmur3Partitioner instead of ByteOrderedPartitioner. ByteOrderedPartitioner is used for other unit tests.
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066637#comment-14066637 ]

Brandon Williams commented on CASSANDRA-6454:

Do we actually need cassandra_pig.yaml? It seems like all we're doing is turning on the native proto which should be fine in the standard config.
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059300#comment-14059300 ]

Alex Liu commented on CASSANDRA-6454:

I commented out that test case; we can fix it later in another ticket.
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059275#comment-14059275 ]

Alex Liu commented on CASSANDRA-6454:

I also got a lib conflict issue, as follows:

{code}
[junit] - ---
[junit] Testcase: testCassandraStorageCompositeColumnCF(org.apache.cassandra.pig.ThriftColumnFamilyTest): Caused an ERROR
[junit] org.antlr.runtime.tree.BaseTree.insertChild(ILjava/lang/Object;)V
[junit] java.lang.NoSuchMethodError: org.antlr.runtime.tree.BaseTree.insertChild(ILjava/lang/Object;)V
[junit]     at org.apache.pig.parser.QueryParser.paren_expr(QueryParser.java:17532)
[junit]     at org.apache.pig.parser.QueryParser.cast_expr(QueryParser.java:17005)
[junit]     at org.apache.pig.parser.QueryParser.multi_expr(QueryParser.java:15679)
[junit]     at org.apache.pig.parser.QueryParser.expr(QueryParser.java:15568)
[junit]     at org.apache.pig.parser.QueryParser.unary_cond(QueryParser.java:15324)
[junit]     at org.apache.pig.parser.QueryParser.not_cond(QueryParser.java:14951)
[junit]     at org.apache.pig.parser.QueryParser.and_cond(QueryParser.java:14828)
[junit]     at org.apache.pig.parser.QueryParser.cond(QueryParser.java:14728)
[junit]     at org.apache.pig.parser.QueryParser.filter_clause(QueryParser.java:10509)
[junit]     at org.apache.pig.parser.QueryParser.op_clause(QueryParser.java:7092)
[junit]     at org.apache.pig.parser.QueryParser.general_statement(QueryParser.java:2314)
[junit]     at org.apache.pig.parser.QueryParser.statement(QueryParser.java:1579)
[junit]     at org.apache.pig.parser.QueryParser.query(QueryParser.java:395)
[junit]     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:236)
[junit]     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:179)
[junit]     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
[junit]     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1625)
[junit]     at org.apache.pig.PigServer.registerQuery(PigServer.java:577)
[junit]     at org.apache.pig.PigServer.registerQuery(PigServer.java:590)
[junit]     at org.apache.cassandra.pig.ThriftColumnFamilyTest.testCassandraStorageCompositeColumnCF(ThriftColumnFamilyTest.java:624)
{code}

Pig 0.12 is built with antlr 3.4; Cassandra uses antlr 3.2.

Other than that, CqlNativeStorage passes all the tests.

> Pig support for hadoop CqlInputFormat
>
> Key: CASSANDRA-6454
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6454
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Reporter: Alex Liu
> Assignee: Alex Liu
> Fix For: 2.0.10
>
> Attachments: 6454-2.0-branch.txt, 6454-v2-2.0-branch.txt
>
> CASSANDRA-6311 adds new CqlInputFormat, we need add the Pig support for it

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058995#comment-14058995 ]

Brandon Williams commented on CASSANDRA-6454:

Yes, that's a bug in CPRR, I'm fine with moving away from it.
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058990#comment-14058990 ]

Alex Liu commented on CASSANDRA-6454:

I got some test errors for compact tables using CqlStorage in CASSANDRA-7059. I hope changing them to CqlNativeStorage fixes the issue.
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057745#comment-14057745 ]

Brandon Williams commented on CASSANDRA-6454:

For 2.1, it's probably a good idea to switch all the tests to native.
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057583#comment-14057583 ]

Brandon Williams commented on CASSANDRA-6454:

Looks good, though some of those params feel like YAGNI to me, but I guess if we already have them we might as well put them in. Can you post a 2.1 branch too?
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011253#comment-14011253 ]

Alex Liu commented on CASSANDRA-6454:

There are a few unit test examples in CqlTableTest. input_cql needs to be encoded.

{code}
using org.apache.cassandra.hadoop.pig.CqlNativeStorage();
should be
using CqlNativeStorage();
{code}

> Pig support for hadoop CqlInputFormat
>
> Key: CASSANDRA-6454
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6454
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Reporter: Alex Liu
> Assignee: Alex Liu
> Fix For: 2.0.9
>
> Attachments: 6454-2.0-branch.txt, 6454-v2-2.0-branch.txt
>
> CASSANDRA-6311 adds new CqlInputFormat, we need add the Pig support for it

--
This message was sent by Atlassian JIRA (v6.2#6252)
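The comment above only says that input_cql "needs to be encoded". A minimal sketch of one way to do that, assuming plain percent-encoding of the whole query is what the storage handler expects (the thread does not spell this out), and using a hypothetical helper name `build_cql_url`:

```python
from urllib.parse import quote, unquote


def build_cql_url(keyspace, table, input_cql):
    """Return a cql:// location with the input_cql value percent-encoded.

    Assumption: CqlNativeStorage accepts a percent-encoded input_cql value.
    """
    # safe="" percent-encodes every reserved character, including
    # spaces, '>', '<', '=', '?', ',' and parentheses.
    encoded = quote(input_cql, safe="")
    return "cql://{}/{}?input_cql={}".format(keyspace, table, encoded)


# Illustrative query; keyspace and table names are placeholders.
query = ("select block_id,breed from excelsior.dogs "
         "where token(block_id,breed) > ? and token(block_id,breed) <= ?")
url = build_cql_url("excelsior", "dogs", query)

# Decoding the parameter again should give back the original query.
decoded = unquote(url.split("input_cql=", 1)[1])
print(url)
```

The resulting string would be used as the location in a Pig `load` statement; the unit tests in CqlTableTest remain the reference for the exact form.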
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011193#comment-14011193 ]

Shridhar commented on CASSANDRA-6454:

[~alexliu68] I have applied the new patch. After this I am not able to load data from Cassandra. It throws an exception:

Caused by: InvalidRequestException(why:Keyspace '' does not exist)
    at org.apache.cassandra.thrift.Cassandra$set_keyspace_result$set_keyspace_resultStandardScheme.read(Cassandra.java:8906)
    at org.apache.cassandra.thrift.Cassandra$set_keyspace_result$set_keyspace_resultStandardScheme.read(Cassandra.java:8892)
    at org.apache.cassandra.thrift.Cassandra$set_keyspace_result.read(Cassandra.java:8842)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:599)
    at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:586)
    at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.initSchema(AbstractCassandraStorage.java:493)

Table schema:

    CREATE TABLE dogs (
        block_id int,
        breed text,
        color text,
        short_hair boolean,
        PRIMARY KEY ((block_id, breed), color, short_hair))

My pig load function looks like this:

    data = load 'cql://excelsior/dogs' using org.apache.cassandra.hadoop.pig.CqlNativeStorage();

I also tried this:

    a = load 'cql://excelsior/dogs?input_cql=select block_id,breed from excelsior.dogs where token(block_id,breed) > ? and token(block_id,breed) <= ? and block_id=5 and breed='Bulldog' ' using org.apache.cassandra.hadoop.pig.CqlNativeStorage();

Still the same exception. Am I doing something wrong in the syntax?
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002525#comment-14002525 ]

Alex Liu commented on CASSANDRA-6454:

The latest patch is attached. To use CqlNativeStorage, the following parameters need to be specified:

{code}
input_cql= (it must be in the following format)
 * 1) the select clause must include the partition key columns
 *    (to calculate the progress based on the actual CF rows processed)
 * 2) the where clause must include token(partition_key1, ... , partition_keyn) > ? and
 *    token(partition_key1, ... , partition_keyn) <= ? (in the right order)

native_port= (if it's not the default port)

other parameters are
[&native_port=][&core_conns=]
[&max_conns=][&min_simult_reqs=][&max_simult_reqs=]
[&native_timeout=][&native_read_timeout=][&rec_buff_size=]
[&send_buff_size=][&solinger=][&tcp_nodelay=]
[&reuse_address=]
[&keep_alive=][&auth_provider=][&trust_store_path=]
[&key_store_path=][&trust_store_password=]
[&key_store_password=][&cipher_suites=][&input_cql=]]
{code}
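The two input_cql rules above can be sketched as a small query builder. This is only an illustration of the stated format: the helper name `build_input_cql` and the keyspace, table, and column names are hypothetical, not part of the patch.

```python
def build_input_cql(keyspace, table, partition_keys, extra_columns=()):
    """Build an input_cql string that follows the two rules in the comment.

    partition_keys must be given in the order they are declared in the
    table's PRIMARY KEY, since token() is order-sensitive.
    """
    if not partition_keys:
        raise ValueError("at least one partition key column is required")
    # Rule 1: the select clause includes every partition key column.
    columns = list(partition_keys)
    columns += [c for c in extra_columns if c not in partition_keys]
    # Rule 2: the where clause bounds token(...) with "> ?" and then "<= ?".
    token = "token({})".format(", ".join(partition_keys))
    return "select {} from {}.{} where {} > ? and {} <= ?".format(
        ", ".join(columns), keyspace, table, token, token)


# Placeholder names for illustration only.
cql = build_input_cql("excelsior", "dogs", ["block_id", "breed"], ["color"])
print(cql)
```

The two `?` placeholders are left unbound on purpose; per the comment, they stand for the token range bounds, which are filled in per split rather than by the caller.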
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002123#comment-14002123 ]

Alex Liu commented on CASSANDRA-6454:

[~minishri] Have you enabled native port 9042 on all the nodes? If the native port is different from 9042, set it in the URL with native_port=

> Pig support for hadoop CqlInputFormat
>
> Key: CASSANDRA-6454
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6454
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Reporter: Alex Liu
> Assignee: Alex Liu
> Fix For: 2.0.9
>
> Attachments: 6454-2.0-branch.txt
>
> CASSANDRA-6311 adds new CqlInputFormat, we need add the Pig support for it

--
This message was sent by Atlassian JIRA (v6.2#6252)
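One quick way to check the question above from a Hadoop node is a plain TCP connect to each Cassandra host. The sketch below assumes that a successful connect is a good enough signal; it only shows that something is listening on the port, not that the native protocol handshake would succeed. The function name `native_port_open` is hypothetical.

```python
import socket


def native_port_open(host, port=9042, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and connects with a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example (host names are placeholders):
# for node in ["cassandra-node1", "cassandra-node2"]:
#     print(node, native_port_open(node))
```

If a node reports False on 9042 but the native transport runs on another port, that port would be the value to pass as native_port= in the URL.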
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999765#comment-13999765 ]

Shridhar commented on CASSANDRA-6454:

[~alexliu68] I tried to run a pig script to load data with CqlNativeStorage but am running into problems. It looks like jar conflicts, or maybe something else. Can you please let me know the required jars and the versions needed to run CqlNativeStorage? Below are the things I tried.

1. Applied this patch on top of Cassandra 2.0.07.
2. When I tried to run the pig script with CqlNativeStorage it threw the exception "ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. com/datastax/driver/core/policies/LoadBalancingPolicy".
3. Then I added "cassandra-driver-core-2.0.1.jar" to my classpath. After this I got the exception "ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: com.codahale.metrics.Metric".
4. Then I added "metrics-core-3.0.2.jar". After adding this jar file I was able to run the job, but it failed and my hadoop log shows the exception below. NOTE: metrics-core-2.2.0.jar already exists in the C* 2.0.7 lib folder.

Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: slave1.tpgsi.com/.yyy.zzz.aa (com.datastax.driver.core.TransportException: [slave1/.yyy.zzz.aa] Error writing))
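Step 4 above ends up with two versions of metrics-core on the classpath. A minimal sketch for spotting that kind of duplicate, assuming jars follow the common `<artifact>-<version>.jar` naming (an assumption; jars named differently are simply ignored) and using a hypothetical helper `find_version_conflicts`:

```python
import re
from collections import defaultdict

# Assumed naming scheme: "<artifact>-<version>.jar", version starts with a digit.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.]*)\.jar$")


def find_version_conflicts(jar_names):
    """Map each artifact that appears in more than one version to its versions."""
    versions = defaultdict(set)
    for name in jar_names:
        match = JAR_NAME.match(name)
        if match:
            versions[match.group("artifact")].add(match.group("version"))
    # Keep only artifacts with at least two distinct versions.
    return {a: sorted(v) for a, v in versions.items() if len(v) > 1}


# Jar names taken from the comment above.
jars = [
    "metrics-core-2.2.0.jar",
    "metrics-core-3.0.2.jar",
    "cassandra-driver-core-2.0.1.jar",
]
conflicts = find_version_conflicts(jars)
print(conflicts)
```

In practice the input list could come from listing the Cassandra lib folder plus whatever jars are registered in the Pig script; a reported conflict says nothing about which version wins at class-load time, only that both are present.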
[jira] [Commented] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000139#comment-14000139 ]

Alex Liu commented on CASSANDRA-6454:

The patch is a little off; I will update it to the latest code.