RE: Creating a branch for Pig on Spark (PIG-4059)

2014-08-25 Thread Zhang, Liyun
Please add me; I work on Pig on Spark.

-----Original Message-----
From: Cheolsoo Park [mailto:piaozhe...@gmail.com] 
Sent: Tuesday, August 26, 2014 10:57 AM
To: Jarek Jarcec Cecho
Cc: dev@pig.apache.org; Praveen R
Subject: Re: Creating a branch for Pig on Spark (PIG-4059)

Hi guys,

I asked the infra mailing list about branch committership, and here is the
reply:

Many projects have what they consider 'partial committers', that is, folks who
have access to specific parts of a project's svn tree. Some projects do this for
GSoC participants, others as a mechanism for moving to 'full committership' 
within the project.

Do note though that in the eyes of the ASF someone with an ICLA and an account 
with any permissions to commit code anywhere in the public svn tree is a 
committer. IOW, you would vote, have ICLAs filed, and request account creation 
as per normal, and then merely adjust the karma in asf-authorization-template 
(and/or LDAP).


Looks like we need to vote and follow the normal process just like any other 
new committer.

@Praveen, Jarcec,
I think Mayur and Praveen from Sigmoid Analytics need branch committership.
Will anyone else work on Pig-on-Spark? Please reply.

Once I have a full list of people, I will open a vote for Pig PMCs.

Thanks,
Cheolsoo


On Mon, Aug 25, 2014 at 11:51 AM, Cheolsoo Park 
wrote:

> Additionally, I will give "branch-specific" commit permission to 
> people who will work on Pig on Spark (assuming it is possible).
>
> Please let me know if you have any objection on this too.
>
>
> On Mon, Aug 25, 2014 at 10:25 AM, Jarek Jarcec Cecho wrote:
>
>> No objections from my side, thank you for creating the branch 
>> Cheolsoo and kudos to the Sigmoid Analytics team for the great work!
>>
>> Jarcec
>>
>> On Aug 25, 2014, at 7:14 PM, Cheolsoo Park  wrote:
>>
>> > Hi devs,
>> >
>> > Sigmoid Analytics has been working on Pig-on-Spark (PIG-4059), and
>> > they want to merge their work into Apache.
>> >
>> > I am going to create a "Spark" branch for them. Please let me know
>> > if you have any concerns.
>> >
>> > Thanks,
>> > Cheolsoo
>>
>>
>


Re: Creating a branch for Pig on Spark (PIG-4059)

2014-08-25 Thread Cheolsoo Park
Hi guys,

I asked the infra mailing list about branch committership, and here is
the reply:

Many projects have what they consider 'partial committers', that is,
folks who have access to specific parts of a project's svn tree. Some
projects do this for GSoC participants, others as a mechanism for
moving to 'full committership' within the project.

Do note though that in the eyes of the ASF someone with an ICLA and an
account with any permissions to commit code anywhere in the public svn
tree is a committer. IOW, you would vote, have ICLAs filed, and
request account creation as per normal, and then merely adjust the
karma in asf-authorization-template (and/or LDAP).


Looks like we need to vote and follow the normal process just like any
other new committer.

@Praveen, Jarcec,
I think Mayur and Praveen from Sigmoid Analytics need branch committership.
Will anyone else work on Pig-on-Spark? Please reply.

Once I have a full list of people, I will open a vote for Pig PMCs.

Thanks,
Cheolsoo


On Mon, Aug 25, 2014 at 11:51 AM, Cheolsoo Park 
wrote:

> Additionally, I will give "branch-specific" commit permission to people
> who will work on Pig on Spark (assuming it is possible).
>
> Please let me know if you have any objection on this too.
>
>
> On Mon, Aug 25, 2014 at 10:25 AM, Jarek Jarcec Cecho 
> wrote:
>
>> No objections from my side, thank you for creating the branch Cheolsoo
>> and kudos to the Sigmoid Analytics team for the great work!
>>
>> Jarcec
>>
>> On Aug 25, 2014, at 7:14 PM, Cheolsoo Park  wrote:
>>
>> > Hi devs,
>> >
>> > Sigmoid Analytics has been working on Pig-on-Spark (PIG-4059), and they
>> > want to merge their work into Apache.
>> >
>> > I am going to create a "Spark" branch for them. Please let me know if
>> > you have any concerns.
>> >
>> > Thanks,
>> > Cheolsoo
>>
>>
>


[jira] Subscription: PIG patch available

2014-08-25 Thread jira
Issue Subscription
Filter: PIG patch available (15 issues)

Subscriber: pigdaily

Key         Summary
PIG-4111    Make Pig compiles with avro-1.7.7
            https://issues.apache.org/jira/browse/PIG-4111
PIG-4103    Fix TestRegisteredJarVisibility(after PIG-4083)
            https://issues.apache.org/jira/browse/PIG-4103
PIG-4066    An optimization for ROLLUP operation in Pig
            https://issues.apache.org/jira/browse/PIG-4066
PIG-4004    Upgrade the Pigmix queries from the (old) mapred API to mapreduce
            https://issues.apache.org/jira/browse/PIG-4004
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3866    Create ThreadLocal classloader per PigContext
            https://issues.apache.org/jira/browse/PIG-3866
PIG-3861    duplicate jars get added to distributed cache
            https://issues.apache.org/jira/browse/PIG-3861
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3635    Fix e2e tests for Hadoop 2.X on Windows
            https://issues.apache.org/jira/browse/PIG-3635
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-3441    Allow Pig to use default resources from Configuration objects
            https://issues.apache.org/jira/browse/PIG-3441

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


Re: Review Request 24789: New logical optimizer rule: ConstantCalculator

2014-08-25 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24789/#review51430
---



trunk/src/org/apache/pig/builtin/CurrentTime.java


If the optimization is disabled, don't we want to fall back to the old behavior
of using pig.job.submitted?



trunk/src/org/apache/pig/newplan/logical/rules/ConstantCalculator.java


There is no processedOperators.add call happening. Is this variable needed?



trunk/src/org/apache/pig/newplan/logical/rules/ConstantCalculator.java


Does it make sense to do this setPlan in the moveTree call itself?


- Thejas Nair


On Aug. 19, 2014, 5:41 p.m., Daniel Dai wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24789/
> ---
> 
> (Updated Aug. 19, 2014, 5:41 p.m.)
> 
> 
> Review request for pig.
> 
> 
> Repository: pig
> 
> 
> Description
> ---
> 
> See PIG-4128
> 
> 
> Diffs
> -
> 
>   trunk/src/org/apache/pig/EvalFunc.java 1618727 
>   trunk/src/org/apache/pig/Main.java 1618727 
>   trunk/src/org/apache/pig/builtin/ABS.java 1618727 
>   trunk/src/org/apache/pig/builtin/ARITY.java 1618727 
>   trunk/src/org/apache/pig/builtin/AddDuration.java 1618727 
>   trunk/src/org/apache/pig/builtin/Assert.java 1618727 
>   trunk/src/org/apache/pig/builtin/BagSize.java 1618727 
>   trunk/src/org/apache/pig/builtin/BagToString.java 1618727 
>   trunk/src/org/apache/pig/builtin/BagToTuple.java 1618727 
>   trunk/src/org/apache/pig/builtin/Base.java 1618727 
>   trunk/src/org/apache/pig/builtin/BigDecimalAbs.java 1618727 
>   trunk/src/org/apache/pig/builtin/BigIntegerAbs.java 1618727 
>   trunk/src/org/apache/pig/builtin/CONCAT.java 1618727 
>   trunk/src/org/apache/pig/builtin/ConstantSize.java 1618727 
>   trunk/src/org/apache/pig/builtin/CubeDimensions.java 1618727 
>   trunk/src/org/apache/pig/builtin/CurrentTime.java 1618727 
>   trunk/src/org/apache/pig/builtin/DIFF.java 1618727 
>   trunk/src/org/apache/pig/builtin/DaysBetween.java 1618727 
>   trunk/src/org/apache/pig/builtin/DoubleRound.java 1618727 
>   trunk/src/org/apache/pig/builtin/DoubleRoundTo.java 1618727 
>   trunk/src/org/apache/pig/builtin/ENDSWITH.java 1618727 
>   trunk/src/org/apache/pig/builtin/EqualsIgnoreCase.java 1618727 
>   trunk/src/org/apache/pig/builtin/FloatAbs.java 1618727 
>   trunk/src/org/apache/pig/builtin/FloatRound.java 1618727 
>   trunk/src/org/apache/pig/builtin/FloatRoundTo.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetDay.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetHour.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetMilliSecond.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetMinute.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetMonth.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetSecond.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetWeek.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetWeekYear.java 1618727 
>   trunk/src/org/apache/pig/builtin/GetYear.java 1618727 
>   trunk/src/org/apache/pig/builtin/HoursBetween.java 1618727 
>   trunk/src/org/apache/pig/builtin/INDEXOF.java 1618727 
>   trunk/src/org/apache/pig/builtin/INVERSEMAP.java 1618727 
>   trunk/src/org/apache/pig/builtin/IntAbs.java 1618727 
>   trunk/src/org/apache/pig/builtin/IsEmpty.java 1618727 
>   trunk/src/org/apache/pig/builtin/KEYSET.java 1618727 
>   trunk/src/org/apache/pig/builtin/LAST_INDEX_OF.java 1618727 
>   trunk/src/org/apache/pig/builtin/LCFIRST.java 1618727 
>   trunk/src/org/apache/pig/builtin/LOWER.java 1618727 
>   trunk/src/org/apache/pig/builtin/LTRIM.java 1618727 
>   trunk/src/org/apache/pig/builtin/LongAbs.java 1618727 
>   trunk/src/org/apache/pig/builtin/MapSize.java 1618727 
>   trunk/src/org/apache/pig/builtin/MilliSecondsBetween.java 1618727 
>   trunk/src/org/apache/pig/builtin/MinutesBetween.java 1618727 
>   trunk/src/org/apache/pig/builtin/MonthsBetween.java 1618727 
>   trunk/src/org/apache/pig/builtin/PluckTuple.java 1618727 
>   trunk/src/org/apache/pig/builtin/REGEX_EXTRACT.java 1618727 
>   trunk/src/org/apache/pig/builtin/REGEX_EXTRACT_ALL.java 1618727 
>   trunk/src/org/apache/pig/builtin/REPLACE.java 1618727 
>   trunk/src/org/apache/pig/builtin/ROUND.java 1618727 
>   trunk/src/org/apache/pig/builtin/ROUND_TO.java 1618727 
>   trunk/src/org/apache/pig/builtin/RTRIM.java 1618727 
>   trunk/src/org/apache/pig/builtin/RollupDimensions.java 1618727 
>   trunk/src/org/apache/pig/builtin/SIZE.java 1618727 
>   trunk/src/org/apache/pig/builtin/SPRINTF.java 1618727 
>   trunk/src/org/apache/pig/builtin/STARTSWITH.java 1618727 
>   trunk/src/org/apache/pig/builtin/STRSPLIT.java 16

Re: Creating a branch for Pig on Spark (PIG-4059)

2014-08-25 Thread Cheolsoo Park
Additionally, I will give "branch-specific" commit permission to people who
will work on Pig on Spark (assuming it is possible).

Please let me know if you have any objection on this too.


On Mon, Aug 25, 2014 at 10:25 AM, Jarek Jarcec Cecho 
wrote:

> No objections from my side, thank you for creating the branch Cheolsoo and
> kudos to the Sigmoid Analytics team for the great work!
>
> Jarcec
>
> On Aug 25, 2014, at 7:14 PM, Cheolsoo Park  wrote:
>
> > Hi devs,
> >
> > Sigmoid Analytics has been working on Pig-on-Spark (PIG-4059), and they
> > want to merge their work into Apache.
> >
> > I am going to create a "Spark" branch for them. Please let me know if
> > you have any concerns.
> >
> > Thanks,
> > Cheolsoo
>
>


[jira] [Commented] (PIG-2175) Switch Pig wiki to use confluence

2014-08-25 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109444#comment-14109444
 ] 

Daniel Dai commented on PIG-2175:
-

Done

> Switch Pig wiki to use confluence
> -
>
> Key: PIG-2175
> URL: https://issues.apache.org/jira/browse/PIG-2175
> Project: Pig
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: PIG-2175-1.patch
>
>
> Confluence gives us more functionality and more permission control features. 
> We plan to migrate our wiki to confluence. I migrated part of our wiki to 
> https://cwiki.apache.org/confluence/display/PIG. I also put a link to the old 
> wiki on that site. The attached patch changes the links on the Pig main site.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (PIG-4059) Pig on Spark

2014-08-25 Thread Gregory Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109333#comment-14109333
 ] 

Gregory Owen commented on PIG-4059:
---

Deleted the link to the benchmark wish list. Will use Pigmix (possibly modified 
to get around features Catalyst Pig-on-Spark doesn't support yet) instead.

> Pig on Spark
> 
>
> Key: PIG-4059
> URL: https://issues.apache.org/jira/browse/PIG-4059
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Praveen Rachabattuni
> Attachments: Pig-on-Spark-Design-Doc.pdf
>
>
> There is a lot of interest in adding Spark as a backend execution engine for
> Pig.





Re: Creating a branch for Pig on Spark (PIG-4059)

2014-08-25 Thread Jarek Jarcec Cecho
No objections from my side, thank you for creating the branch Cheolsoo and 
kudos to the Sigmoid Analytics team for the great work!

Jarcec

On Aug 25, 2014, at 7:14 PM, Cheolsoo Park  wrote:

> Hi devs,
> 
> Sigmoid Analytics has been working on Pig-on-Spark (PIG-4059), and they want 
> to merge their work into Apache.
> 
> I am going to create a "Spark" branch for them. Please let me know if you 
> have any concerns.
> 
> Thanks,
> Cheolsoo



[jira] [Commented] (PIG-4059) Pig on Spark

2014-08-25 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109330#comment-14109330
 ] 

Cheolsoo Park commented on PIG-4059:


[~praveenr019], sure. The only thing is that trunk has been quite unstable 
recently due to Tez API changes. But they've settled down for 0.5.0 release 
now. (RC is on vote.) As soon as we can pin Tez to a stable release, I will cut 
a branch for you. I also sent out an email to dev mailing list.

> Pig on Spark
> 
>
> Key: PIG-4059
> URL: https://issues.apache.org/jira/browse/PIG-4059
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Praveen Rachabattuni
> Attachments: Pig-on-Spark-Design-Doc.pdf
>
>
> There is a lot of interest in adding Spark as a backend execution engine for
> Pig.





Creating a branch for Pig on Spark (PIG-4059)

2014-08-25 Thread Cheolsoo Park
Hi devs,

Sigmoid Analytics has been working on Pig-on-Spark (PIG-4059), and they want
to merge their work into Apache.

I am going to create a "Spark" branch for them. Please let me know if you
have any concerns.

Thanks,
Cheolsoo


[jira] [Commented] (PIG-4134) TEZ-1449 broke the build

2014-08-25 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109157#comment-14109157
 ] 

Koji Noguchi commented on PIG-4134:
---

bq. After my patch, trunk successfully gets compiled but I'm seeing about 60 
more e2e-tez failures.  Not sure if this is due to my patch or something else 
that changed on the tez side.

This was due to change on Tez side.  [~daijy] fixed it in PIG-4140.  Thanks 
Daniel!
About 60 e2e tests fixed at once :)


> TEZ-1449 broke the build
> 
>
> Key: PIG-4134
> URL: https://issues.apache.org/jira/browse/PIG-4134
> Project: Pig
>  Issue Type: Bug
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Trivial
> Fix For: 0.14.0
>
> Attachments: pig-4134-v01.patch
>
>






[jira] [Commented] (PIG-4059) Pig on Spark

2014-08-25 Thread Praveen Rachabattuni (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109121#comment-14109121
 ] 

Praveen Rachabattuni commented on PIG-4059:
---

[~cheolsoo] Could you please create a branch for Spark, so we can commit Pig on
Spark (migrated to pig-13:
https://github.com/sigmoidanalytics/spork/tree/spork-pig-13) there?

> Pig on Spark
> 
>
> Key: PIG-4059
> URL: https://issues.apache.org/jira/browse/PIG-4059
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Praveen Rachabattuni
> Attachments: Pig-on-Spark-Design-Doc.pdf
>
>
> There is a lot of interest in adding Spark as a backend execution engine for
> Pig.


