[
https://issues.apache.org/jira/browse/BIGTOP-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280681#comment-14280681
]
RJ Nowling commented on BIGTOP-1586:
------------------------------------
Comments on patch:
1. Import of SparkETL in ETLSuite causes an import warning in the build. I'm
not sure it's necessary since the test suite and SparkETL are in the same
package.
2. Customers, locations, and products in ETLSuite were reordered, which makes
the diff look much larger than it is. It's hard to see what actually changed.
I also think you commented out a line that duplicates one a few lines above
(customers).
3. Not a fan of removing the Locations arrays -- you end up with duplication,
which makes it harder to change things.
4. Why were the additional calendars removed?
5. We should add more tests for dateTime in the equality method in
TransactionProduct. Please check timezone, year, month, day, hour, minutes,
and seconds individually. If any of those fail, then we can fix the date/time
parsing code. The failures may turn out not to be in those fields, but
checking them will still help us debug the code.
6. I think the approach used in the ParseRawData test is hard to read. Can it
be cleaned up? I think it would be better to define a series of methods like
compare(Transaction, Transaction), compare(Store, Store), compare(Calendar,
Calendar), etc. that call each other recursively. More verbose but easier to
understand. Or define equals() methods for each case class.
> BigPetStore-Spark only works on the East Coast .
> ------------------------------------------------
>
> Key: BIGTOP-1586
> URL: https://issues.apache.org/jira/browse/BIGTOP-1586
> Project: Bigtop
> Issue Type: Bug
> Components: blueprints
> Affects Versions: 0.9.0
> Reporter: jay vyas
> Attachments: BIGTOP-1586.patch, dirty.patch
>
>
> Yup, it's true. I think :)
> When visiting my parents in *Oklahoma* I found that, the way
> bigpetstore-spark is set up, only people on the *right* coast can run the
> unit tests. It has something to do with the default time zone setup in Java
> and the unit tests, which test for set equivalence.
> {noformat}
> s
> Failed to match [Lscala.Tuple5;@7ac2933e and [Lscala.Tuple5;@7c510a68
> missing
> (Store(5,11553),Location(11553,Uniondale,NY),Customer(999,Cesareo,Lamplough,20152),Location(20152,Chantilly,VA),TransactionProduct(999,32,5,java.util.GregorianCalendar[time=1446530891000,areFieldsSet=true,areAllFieldsSet=false,lenient=true,zone=sun.util.calendar.ZoneInfo[id="America/Chicago",offset=-21600000,dstSavings=3600000,useDaylight=true,transitions=235,lastRule=java.util.SimpleTimeZone[id=America/Chicago,offset=-21600000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2015,MONTH=10,WEEK_OF_YEAR=45,WEEK_OF_MONTH=1,DAY_OF_MONTH=3,DAY_OF_YEAR=307,DAY_OF_WEEK=3,DAY_OF_WEEK_IN_MONTH=1,AM_PM=0,HOUR=0,HOUR_OF_DAY=0,MINUTE=8,SECOND=11,MILLISECOND=0,ZONE_OFFSET=-21600000,DST_OFFSET=0],category=dry
> dog food;brand=Happy Pup;flavor=Fish & Potato;size=30.0;per_unit_cost=2.67;))
> ... not found=
> (Store(5,11553),Location(11553,Uniondale,NY),Customer(999,Cesareo,Lamplough,20152),Location(20152,Chantilly,VA),TransactionProduct(999,32,5,java.util.GregorianCalendar[time=?,areFieldsSet=false,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="GMT",offset=0,dstSavings=0,useDaylight=false,transitions=0,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2015,MONTH=9,WEEK_OF_YEAR=2,WEEK_OF_MONTH=2,DAY_OF_MONTH=12,DAY_OF_YEAR=5,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=1,AM_PM=0,HOUR=6,HOUR_OF_DAY=4,MINUTE=29,SECOND=46,MILLISECOND=0,ZONE_OFFSET=0,DST_OFFSET=0],category=dry
> dog food;brand=Happy Pup;flavor=Fish & Potato;size=30.0;per_unit_cost=2.67;))
> ... not found= (Store(1,98110),Location(98110,Bainbridge
> Islan,WA),Customer(999,Cesareo,Lamplough,20152),Location(20152,Chantilly,VA),TransactionProduct(999,31,1,java.util.GregorianCalendar[time=?,areFieldsSet=false,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="GMT",offset=0,dstSavings=0,useDaylight=false,transitions=0,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2015,MONTH=10,WEEK_OF_YEAR=2,WEEK_OF_MONTH=2,DAY_OF_MONTH=3,DAY_OF_YEAR=5,DAY_OF_WEEK=2,DAY_OF_WEEK_IN_MONTH=1,AM_PM=0,HOUR=6,HOUR_OF_DAY=1,MINUTE=8,SECOND=11,MILLISECOND=0,ZONE_OFFSET=0,DST_OFFSET=0],category=poop
> bags;brand=Dog Days;color=Blue;size=60.0;per_unit_cost=0.21;))
> ... not found=
> (Store(6,66067),Location(66067,Ottawa,KS),Customer(999,Cesareo,Lamplough,20152),Location(20152,Chantilly,VA),TransactionProduct
> {noformat}
> I've got a patch coming that has some more general improvements, and that
> also removes some of the monads in the unit tests to make them easier to
> debug/run.
> I'll submit it shortly, once I fully fix the time zone issue.
> Even though I love the East Coast, I have a dream that anyone, regardless of
> the coast they are on, race, religion, or creed, can use Spark to generate
> petabytes of fake data!
>