Okay, good to know. I'll be back in SF on Friday and will sit down w/some of my friends who know HBase better than I do and take another look.
J

On Tue, Jan 29, 2013 at 9:12 AM, Micah Whitacre <[email protected]> wrote:

> Unfortunately it doesn't look like this is just a test failure as
> running against a CDH4.1.1 cluster fails in the exact same manner.
> Here is a copy of the code I used[1]
>
> [1] - http://pastebin.com/QLEc5fmG
>
> On Tue, Jan 29, 2013 at 8:44 AM, Micah Whitacre <[email protected]> wrote:
> > The problem of reading from the same table twice seems interesting.
> > At one point when trying to figure out the problem I tweaked the test
> > to run the joinedTable through the same wordCount steps to make sure
> > everything was read and then persisted correctly. So the flow of the
> > test became:
> >
> > write to wordcount table
> > wordcount
> > write to join table
> > wordcount the join table (output to a different table)
> > attempt to join words with others.
> >
> > That flow would work as expected but still fail on the last join. So
> > it seems like it would be reading in correctly from HBase.
> >
> > I am working on building a stand alone example and will report back
> > the findings.
> >
> > thanks for your help,
> > micah
> >
> > On Mon, Jan 28, 2013 at 11:55 PM, Josh Wills <[email protected]> wrote:
> >> I have to call it a night, but this is an odd one.
> >>
> >> The basic problem seems to be that we are reading from the same table
> >> twice-- it seems like the HTable object is the same on both splits
> >> (always reading from the words table, or always reading from the
> >> joinTableName table), but the Scan object appears to get updated. I
> >> verified this by using a different column family on the joinTableName
> >> table and seeing that the test returned no output for the join, which
> >> is what we would expect if one of the reads had no input.
> >>
> >> Looking in the code, I don't see a place where the 0.92.1 and 0.90.4
> >> code differ significantly in terms of the input format, record reader,
> >> etc.
> >> I'm on the road this week, but I'd like to work on this one some more
> >> when I'm back in SF and can sit down with my co-workers who know more
> >> HBase than I do.
> >>
> >> Out of curiosity-- is it just the unit test that fails, or can you run
> >> a real HBase MR job that suffers from this problem?
> >>
> >> J
> >>
> >> On Mon, Jan 28, 2013 at 7:26 PM, Josh Wills <[email protected]> wrote:
> >>>
> >>> Ack, sorry-- was checking email on my phone and didn't see the patch. I
> >>> can replicate it locally, digging in now.
> >>>
> >>> On Mon, Jan 28, 2013 at 6:47 PM, Whitacre,Micah
> >>> <[email protected]> wrote:
> >>>>
> >>>> The patch should contain the specifics but I've tested using 4.1.1,
> >>>> 4.1.2, and 4.1.3. Each gives the same results.
> >>>>
> >>>> On Jan 28, 2013, at 20:44, "Josh Wills" <[email protected]> wrote:
> >>>>
> >>>> I usually run them in Eclipse, but not using a particularly special
> >>>> run configuration (I think.) Let me see if I can replicate that one--
> >>>> which CDH version?
> >>>>
> >>>> On Mon, Jan 28, 2013 at 3:13 PM, Micah Whitacre <[email protected]>
> >>>> wrote:
> >>>>>
> >>>>> Related to this thread, where I asked how to save off the
> >>>>> intermediate state but in general how do you debug the project,
> >>>>> specifically for the IT tests? Do you typically run through Eclipse
> >>>>> with special profiles?
> >>>>>
> >>>>> I'm still trying to track down an odd failure in crunch-hbase when
> >>>>> swapping out the dependencies to use CDH4.1.x. The test failure
> >>>>> seems to indicate the test is joining the same PCollection on itself.
> >>>>>
> >>>>> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 63.13
> >>>>> sec <<< FAILURE!
> >>>>> testWordCount(org.apache.crunch.io.hbase.WordCountHBaseIT) Time
> >>>>> elapsed: 62.789 sec <<< FAILURE!
> >>>>> java.lang.AssertionError: expected:<[cat,zebra, cat,donkey, dog,bird]>
> >>>>> but was:<[bird,bird, zebra,zebra, horse,horse, donkey,donkey]>
> >>>>> at org.junit.Assert.fail(Assert.java:93)
> >>>>> at org.junit.Assert.failNotEquals(Assert.java:647)
> >>>>> at org.junit.Assert.assertEquals(Assert.java:128)
> >>>>> at org.junit.Assert.assertEquals(Assert.java:147)
> >>>>> at org.apache.crunch.io.hbase.WordCountHBaseIT.run(WordCountHBaseIT.java:257)
> >>>>> at org.apache.crunch.io.hbase.WordCountHBaseIT.testWordCount(WordCountHBaseIT.java:202)
> >>>>>
> >>>>> and sometimes:
> >>>>>
> >>>>> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 71.958
> >>>>> sec <<< FAILURE!
> >>>>> testWordCount(org.apache.crunch.io.hbase.WordCountHBaseIT) Time
> >>>>> elapsed: 71.469 sec <<< FAILURE!
> >>>>> java.lang.AssertionError: expected:<[cat,zebra, cat,donkey, dog,bird]>
> >>>>> but was:<[dog,dog, cat,cat]>
> >>>>> at org.junit.Assert.fail(Assert.java:93)
> >>>>> at org.junit.Assert.failNotEquals(Assert.java:647)
> >>>>> at org.junit.Assert.assertEquals(Assert.java:128)
> >>>>> at org.junit.Assert.assertEquals(Assert.java:147)
> >>>>> at org.apache.crunch.io.hbase.WordCountHBaseIT.run(WordCountHBaseIT.java:259)
> >>>>> at org.apache.crunch.io.hbase.WordCountHBaseIT.testWordCount(WordCountHBaseIT.java:202)
> >>>>>
> >>>>> Most likely due to the same reason Crunch requires a special build of
> >>>>> HBase 0.94.1, I've found I need to mix and match CDH4 versions as
> >>>>> shown by the attached patch. For the Crunch core build I need to use
> >>>>> all of the latest 2.0.0 code but for testing crunch-hbase I need to
> >>>>> use the mrv1 fork for hadoop-core and hadoop-minicluster. I wouldn't
> >>>>> think that either of those would affect the tests unless somehow the
> >>>>> files used for the intermediate states were not being temporarily
> >>>>> stored correctly.
> >>>>> The fact that the test fails differently does make
> >>>>> me wonder about a concurrency issue but I'm not sure where.
> >>>>>
> >>>>> Any pointers on debugging would be helpful.
> >>>>> Micah
> >>>>>
> >>>>> On Thu, Jan 24, 2013 at 2:24 PM, Micah Whitacre <[email protected]>
> >>>>> wrote:
> >>>>> > I am creating an entirely new profile simply to keep my changes
> >>>>> > separate from what is in apache/master.
> >>>>> >
> >>>>> > Thanks for the hint about the "naive" approach. Previously I had
> >>>>> > the following:
> >>>>> >
> >>>>> > <hadoop.version>2.0.0-cdh4.1.1</hadoop.version>
> >>>>> > <hadoop.client.version>2.0.0-mr1-cdh4.1.1</hadoop.client.version>
> >>>>> > <hbase.version>0.92.1-cdh4.1.1</hbase.version>
> >>>>> >
> >>>>> > If I follow what you did and change it to:
> >>>>> >
> >>>>> > <hadoop.version>2.0.0-cdh4.1.1</hadoop.version>
> >>>>> > <hadoop.client.version>2.0.0-cdh4.1.1</hadoop.client.version>
> >>>>> > <hbase.version>0.92.1-cdh4.1.1</hbase.version>
> >>>>> >
> >>>>> > The build gets farther. I now have a different failure in
> >>>>> > crunch-hbase I'll start working on.
> >>>>> >
> >>>>> > Thanks for your help.
> >>>>> > Micah
> >>>>> >
> >>>>> > On Thu, Jan 24, 2013 at 12:23 PM, Josh Wills <[email protected]>
> >>>>> > wrote:
> >>>>> >> Micah,
> >>>>> >>
> >>>>> >> I did the naive thing and just swapped in 2.0.0-cdh4.1.2 for
> >>>>> >> 2.0.0-alpha in the crunch.platform=2 profile in the top level POM
> >>>>> >> and then added in the Cloudera repositories. That works for me--
> >>>>> >> does it work for you? It sounds to me like you're creating an
> >>>>> >> entirely new profile.
> >>>>> >>
> >>>>> >> J
> >>>>> >>
> >>>>> >> On Thu, Jan 24, 2013 at 7:58 AM, Micah Whitacre
> >>>>> >> <[email protected]> wrote:
> >>>>> >>>
> >>>>> >>> running dependency:tree on both projects shows that the version
> >>>>> >>> of Avro is 1.7.0 for running under both profiles.
> >>>>> >>> I wish it was that easy. :)
> >>>>> >>>
> >>>>> >>> On Thu, Jan 24, 2013 at 9:53 AM, Josh Wills <[email protected]>
> >>>>> >>> wrote:
> >>>>> >>> >
> >>>>> >>> > On Thu, Jan 24, 2013 at 6:40 AM, Micah Whitacre
> >>>>> >>> > <[email protected]> wrote:
> >>>>> >>> >>
> >>>>> >>> >> Taking a step back and comparing what is being generated for a
> >>>>> >>> >> normal successful test run of "-Dcrunch.platform=2" I do see a
> >>>>> >>> >> p1 and p2 directory being created, with the expected
> >>>>> >>> >> materialized output being in the p1 directory. So I'm still
> >>>>> >>> >> curious about tracking all of the intermediate state but it
> >>>>> >>> >> doesn't look like it is an issue with regard to creating the
> >>>>> >>> >> output in the wrong directory.
> >>>>> >>> >
> >>>>> >>> > That's a relief. :)
> >>>>> >>> >
> >>>>> >>> > I think the issue with temp outputs has to do with our use of
> >>>>> >>> > the TemporaryPath libraries for creating, well, temporary paths.
> >>>>> >>> > We do this so we play nicely with CI frameworks, but you might
> >>>>> >>> > need to disable it for investigating intermediate outputs.
> >>>>> >>> >
> >>>>> >>> > Re: the specific error you're seeing, that looks interesting. I
> >>>>> >>> > wonder if it's an Avro version change or some such thing. Will
> >>>>> >>> > see if I can replicate it.
> >>>>> >>> > > >>>>> >>> > > >>>>> >>> > -- > >>>>> >>> > Director of Data Science > >>>>> >>> > Cloudera > >>>>> >>> > Twitter: @josh_wills > >>>>> >> > >>>>> >> > >>>>> >> > >>>>> >> > >>>>> >> -- > >>>>> >> Director of Data Science > >>>>> >> Cloudera > >>>>> >> Twitter: @josh_wills > >>>> > >>>> > >>>> > >>>> > >>>> -- > >>>> Director of Data Science > >>>> Cloudera > >>>> Twitter: @josh_wills > >>>> > >>>> CONFIDENTIALITY NOTICE This message and any included attachments are > from > >>>> Cerner Corporation and are intended only for the addressee. The > information > >>>> contained in this message is confidential and may constitute inside or > >>>> non-public information under international, federal, or state > securities > >>>> laws. Unauthorized forwarding, printing, copying, distribution, or > use of > >>>> such information is strictly prohibited and may be unlawful. If you > are not > >>>> the addressee, please promptly delete this message and notify the > sender of > >>>> the delivery error by e-mail or you may call Cerner's corporate > offices in > >>>> Kansas City, Missouri, U.S.A at (+1) (816)221-1024. > >>> > >>> > >>> > >>> > >>> -- > >>> Director of Data Science > >>> Cloudera > >>> Twitter: @josh_wills > >> > >> > >> > >> > >> -- > >> Director of Data Science > >> Cloudera > >> Twitter: @josh_wills > -- Director of Data Science Cloudera <http://www.cloudera.com> Twitter: @josh_wills <http://twitter.com/josh_wills>
