[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012221#comment-14012221 ]

Sylvain Lebresne commented on CASSANDRA-7241:

That's not my territory any more, un-assigning myself.

> Pig test fails on 2.1 branch
>
> Key: CASSANDRA-7241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7241
> Project: Cassandra
> Issue Type: Bug
> Reporter: Alex Liu
> Fix For: 2.1 rc1
>
> Attachments: 0001-Revert-5417-changes-to-hadoop-stuff.txt, repro-cli.txt, repro-pig.txt
>
> Run ant pig-test on the cassandra-2.1 branch. Many tests fail. I traced it a little and found that the Pig test failures start from the
> https://github.com/apache/cassandra/commit/362cc05352ec67e707e0ac790732e96a15e63f6b
> commit.
> It looks like the storage changes break the Pig tests.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011544#comment-14011544 ]

Brandon Williams commented on CASSANDRA-7241:

So for the last problem, something is wrong in CqlStorage's split selection. If I disable vnodes, it works, but with them enabled I get no output at all. And indeed, if I use the same query in cqlsh nothing is returned, so the split is bogus.
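For readers less familiar with input splits: each split covers a token range, and the reader runs a token-restricted query for that range. The sketch below is illustrative only and does not claim to reproduce the bug above. The class name, the ks.tbl table, the key column, and the query shape are all assumptions, not the CqlStorage code. It shows one generic way a range can match nothing under a plain "greater than start, at most end" comparison: a range that wraps around the ring has to be split into non-wrapping pieces first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names and query shape are assumptions, not Cassandra code.
class TokenRangeSketch {
    final long start; // exclusive lower bound
    final long end;   // inclusive upper bound

    TokenRangeSketch(long start, long end) {
        this.start = start;
        this.end = end;
    }

    /** A range whose start is not below its end wraps around the ring. */
    boolean wraps() {
        return start >= end;
    }

    /** What a plain "token > start AND token <= end" comparison matches. */
    boolean naiveContains(long token) {
        return token > start && token <= end;
    }

    /** Split a wrapping range into two non-wrapping pieces (sketch only:
     *  Long.MIN_VALUE itself is left out by the exclusive lower bound). */
    List<TokenRangeSketch> unwrap() {
        List<TokenRangeSketch> out = new ArrayList<>();
        if (!wraps()) {
            out.add(this);
            return out;
        }
        out.add(new TokenRangeSketch(start, Long.MAX_VALUE));
        out.add(new TokenRangeSketch(Long.MIN_VALUE, end));
        return out;
    }

    /** Assumed query shape for one range; table and column names are made up. */
    String toQuery() {
        return "SELECT * FROM ks.tbl WHERE token(key) > " + start
             + " AND token(key) <= " + end;
    }

    public static void main(String[] args) {
        TokenRangeSketch wrapping = new TokenRangeSketch(100, -100);
        System.out.println("naive match for 500: " + wrapping.naiveContains(500));
        for (TokenRangeSketch piece : wrapping.unwrap()) {
            System.out.println(piece.toQuery());
        }
    }
}
```

Pasting the query printed for each unwrapped piece into cqlsh is a quick way to check by hand whether a given split should return rows.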
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011412#comment-14011412 ]

Brandon Williams commented on CASSANDRA-7241:

This fixes about 90% of our problems, so I committed it. Still have one failure in ThriftColumnFamilyTest I'll take a look at.
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010668#comment-14010668 ]

Brandon Williams commented on CASSANDRA-7241:

That makes sense, but I think we have two problems then, since none of those methods appear in the stacktrace I recently posted.
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010621#comment-14010621 ]

Alex Liu commented on CASSANDRA-7241:

I found the following place by peeking into the code:

{code}
/** convert a cql column to an object */
private Object cqlColumnToObj(Cell col, CfDef cfDef) throws IOException
{
    // standard
    Map<ByteBuffer, AbstractType> validators = getValidatorMap(cfDef);
    ByteBuffer cellName = col.name().toByteBuffer();
    if (validators.get(cellName) == null)
        return cassandraToObj(getDefaultMarshallers(cfDef).get(MarshallerType.DEFAULT_VALIDATOR), col.value());
    else
        return cassandraToObj(validators.get(cellName), col.value());
}
{code}

Here, ByteBuffer cellName = col.name().toByteBuffer() is not the right way to get the column name. It should use the Cell type comparator.
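To illustrate the point about cell names, here is a minimal standalone sketch, not the Cassandra code. It assumes the validator map is keyed by the plain column-name bytes, while the cell name arrives in a composite encoding (assumed here to be, per component, a 2-byte length, the component bytes, and a one-byte end-of-component marker). Under those assumptions a lookup with the raw buffer misses and falls back to the default validator, whereas extracting the column-name component first finds the right one. All class and method names below are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; none of these names come from Cassandra.
class CellNameLookupSketch {
    /** Encode each component as: 2-byte length, bytes, 1-byte end marker (assumed layout). */
    static ByteBuffer compositeOf(String... components) {
        int size = 0;
        for (String c : components) {
            size += 2 + c.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (String c : components) {
            byte[] b = c.getBytes(StandardCharsets.UTF_8);
            out.putShort((short) b.length).put(b).put((byte) 0);
        }
        out.flip();
        return out;
    }

    /** Return the last component, which stands for the column name in this sketch. */
    static ByteBuffer lastComponent(ByteBuffer composite) {
        ByteBuffer in = composite.duplicate();
        ByteBuffer last = null;
        while (in.hasRemaining()) {
            int len = in.getShort() & 0xFFFF;
            last = in.slice();
            last.limit(len);
            in.position(in.position() + len + 1); // skip bytes and end marker
        }
        return last;
    }

    static ByteBuffer plain(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    /** Mirrors the null check in the quoted method: fall back when the name is unknown. */
    static String validatorFor(Map<ByteBuffer, String> validators, ByteBuffer name) {
        String v = validators.get(name);
        return v == null ? "DEFAULT_VALIDATOR" : v;
    }

    public static void main(String[] args) {
        Map<ByteBuffer, String> validators = new HashMap<>();
        validators.put(plain("col"), "Int32Type");
        ByteBuffer cell = compositeOf("ck1", "col");
        System.out.println(validatorFor(validators, cell));                // falls back
        System.out.println(validatorFor(validators, lastComponent(cell))); // finds the validator
    }
}
```

The lookup works on the extracted component because ByteBuffer equality and hashing are defined over the remaining bytes, so a slice holding "col" matches the map key.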
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010491#comment-14010491 ]

Brandon Williams commented on CASSANDRA-7241:

Can you explain why Column vs Cell makes a difference?
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010485#comment-14010485 ]

Alex Liu commented on CASSANDRA-7241:

If we roll back the Cell changes in Pig/Hadoop, those errors should go away. If we have to, we can create a new Hadoop Reader/InputFormat using the Cell implementation, since Hadoop tasks run in a separate process. I don't see that using Cell instead of the Thrift Column really makes much difference.
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010299#comment-14010299 ]

Brandon Williams commented on CASSANDRA-7241:

So, I took a different tack and ignored these errors, and instead focused on ThriftColumnFamilyDataTypeTest, which fails with:

{noformat}
[junit] Testcase: testCassandraStorageDataType(org.apache.cassandra.pig.ThriftColumnFamilyDataTypeTest): Caused an ERROR
[junit] org.apache.pig.data.DefaultDataBag cannot be cast to org.apache.pig.data.Tuple
[junit] java.lang.ClassCastException: org.apache.pig.data.DefaultDataBag cannot be cast to org.apache.pig.data.Tuple
[junit]     at org.apache.cassandra.pig.ThriftColumnFamilyDataTypeTest.testCassandraStorageDataType(ThriftColumnFamilyDataTypeTest.java:150)
{noformat}

After a lengthy, tricky, painful bisect, I land back at CASSANDRA-5417. This test fails 100% of the time, and given the error I don't see how it can possibly be a timing issue.

So I recreated this test using the cli and pig so I could run it manually, and I get this:

{noformat}
org.apache.cassandra.serializers.MarshalException: Invalid UTF-8 bytes deadbeef
    at org.apache.cassandra.serializers.AbstractTextSerializer.deserialize(AbstractTextSerializer.java:43)
    at org.apache.cassandra.serializers.AbstractTextSerializer.deserialize(AbstractTextSerializer.java:26)
    at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:142)
    at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.columnToTuple(AbstractCassandraStorage.java:131)
    at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(CassandraStorage.java:256)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:194)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
{noformat}

Which is interesting, since the deadbeef column is BytesType (verified in the cli), and the line in ACS that throws is also from CASSANDRA-5417. I'm left to conclude that, if the problem is in pig, it's still CASSANDRA-5417's fault :)

I can attach the cli-ified script and very simple pig script to run against if needed.
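As a standalone illustration of why that exception points at the wrong type being applied (a sketch using only the JDK, not the Cassandra serializers): the four bytes DE AD BE EF are not well-formed UTF-8, so any strict text decoder has to reject them, while a bytes-style rendering simply prints them as hex. The class and method names below are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the Cassandra serializer code.
class DeadbeefDecodeSketch {
    static final byte[] DEADBEEF = {(byte) 0xDE, (byte) 0xAD, (byte) 0xBE, (byte) 0xEF};

    /** Strictly decode as UTF-8, roughly what a text type has to do with a value. */
    static String asText(byte[] value) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(value))
                .toString();
    }

    /** Render as hex, roughly what a bytes type would show. */
    static String asHex(byte[] value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    static boolean isValidUtf8(byte[] value) {
        try {
            asText(value);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("hex: " + asHex(DEADBEEF));
        System.out.println("valid UTF-8: " + isValidUtf8(DEADBEEF));
    }
}
```

DE AD on its own is a well-formed two-byte sequence, but BE is a continuation byte with no lead byte before it, which is what makes the strict decoder fail.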
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003413#comment-14003413 ]

Sylvain Lebresne commented on CASSANDRA-7241:

If others run into that NPE in MiniDFSCluster.startDataNodes, the problem/solution is explained [here|http://stackoverflow.com/questions/17625938/hbase-minidfscluster-java-fails-in-certain-environments].

With that fixed, I do get the same testCqlStorageCompositeKeyTable failure. That said, when starting to look into what's going on, I added one simple log statement in ModificationStatement to log the applied mutation, and somehow that was enough to make the test pass some of the time (not every time, but often enough). So it's a timing issue, probably something that's not really related to CASSANDRA-5417 but was exposed by it because the timing changed.

Where to go from there, I'm not sure. I have zero experience with Pig or Hadoop and our integration of it, and I have a hard time wrapping my head around the CqlStorage/AbstractCassandraStorage code tbh, so I'm not sure I'm the most qualified person to track this race down. But given that the pig tests seem to be the only ones having problems of that sort, it suggests (though I can't be sure) that said race is more likely in either the pig code or the pig tests than anything else.

So I'm going to unassign myself for now because I just don't have the time and experience to dig into the pig layers. If someone more familiar with those is able to reduce the failure to something Cassandra specific, I'll gladly take a look at that.
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1452#comment-1452 ]

Brandon Williams commented on CASSANDRA-7241:

I get a lot of errors, but actually not that one. No configuration is required for 'ant pig-test' so I'm not sure what happened there.

I won't begin to pretend this is the only problem we have with pig on 2.1. That said, what I am concerned with here is:

{noformat}
[junit] Testcase: testCqlStorageCompositeKeyTable(org.apache.cassandra.pig.CqlTableTest): FAILED
[junit] expected:<0> but was:<9>
[junit] junit.framework.AssertionFailedError: expected:<0> but was:<9>
[junit]     at org.apache.cassandra.pig.CqlTableTest.testCqlStorageCompositeKeyTable(CqlTableTest.java:167)
{noformat}

Which if you attempt to bisect from 2.0 to 2.1, lands you at CASSANDRA-5417, but admittedly fails in even more tests there in different ways. Despite a load of warnings, more things pass on current 2.1. Are there any other tickets which might affect CASSANDRA-5417's behavior since then?
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998980#comment-13998980 ]

Brandon Williams commented on CASSANDRA-7241:

I can confirm that CASSANDRA-5417 is definitely the first commit where this broke, applying CASSANDRA-6877 and testing the commit before it.
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1457#comment-1457 ]

Brandon Williams commented on CASSANDRA-7241:

Also worth noting that pig-test completes perfectly on 2.0 now.
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999898#comment-13999898 ]

Sylvain Lebresne commented on CASSANDRA-7241:

The output of those tests is a bit of a mess, but basically I get both failing with:

{noformat}
[junit] java.lang.ExceptionInInitializerError
[junit]     at org.apache.cassandra.pig.PigTestBase.startHadoopCluster(PigTestBase.java:109)
[junit]     at org.apache.cassandra.pig.CqlTableDataTypeTest.setup(CqlTableDataTypeTest.java:216)
[junit] Caused by: java.lang.NullPointerException
[junit]     at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:422)
[junit]     at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:280)
[junit]     at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:124)
[junit]     at org.apache.pig.test.MiniCluster.setupMiniDfsAndMrClusters(MiniCluster.java:51)
[junit]     at org.apache.pig.test.MiniGenericCluster.(MiniGenericCluster.java:49)
[junit]     at org.apache.pig.test.MiniCluster.(MiniCluster.java:32)
[junit]     at org.apache.pig.test.MiniGenericCluster.(MiniGenericCluster.java:45)
{noformat}

but I doubt CASSANDRA-5417 is the source of that. Is there any configuration needed to get those tests working?
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998291#comment-13998291 ]

Brandon Williams commented on CASSANDRA-7241:

To be clear, I think CASSANDRA-5417 broke something, but it's only exposed by hadoop. Unfortunately testing outside of pig to confirm is complicated by CASSANDRA-7200.