[jira] [Commented] (HADOOP-12547) Remove hadoop-pipes

2015-11-04 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14989763#comment-14989763
 ] 

Allen Wittenauer commented on HADOOP-12547:
---

bq. If we are going to keep this, I would like to see some unit tests, 
documentation, and actual maintenance.

This pretty much sums up my feelings towards libwebhdfs as well, yet we're 
keeping it around because someone, somewhere might have fixed the compilation 
issues (hint: it's still broken in this recent 2.6 release) and is actually 
using it.

I see no reason to remove pipes if we're not also removing libwebhdfs.  The 
reasons to remove are pretty much identical, but at least with pipes a) it has 
compiled for most of its lifetime and b) we have evidence that people are 
actually looking at it.

So from where I stand, they either both go or both stay.

> Remove hadoop-pipes
> ---
>
> Key: HADOOP-12547
> URL: https://issues.apache.org/jira/browse/HADOOP-12547
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Priority: Minor
>
> Development appears to have stopped on hadoop-pipes upstream for the last few 
> years, aside from very basic maintenance.  Hadoop streaming seems to be a 
> better alternative, since it supports more programming languages and is 
> better implemented.
> There were no responses to a message on the mailing list asking for users of 
> Hadoop pipes... and in my experience, I have never seen anyone use this.  We 
> should remove it to reduce our maintenance burden and build times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12547) Remove hadoop-pipes

2015-11-04 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14989803#comment-14989803
 ] 

Kihwal Lee commented on HADOOP-12547:
---

bq. My guess is no, given most of that team has left/was shipped over to 
Microsoft.
That was certainly the case at one point in the past (2.5 CEOs ago), but you 
all know nothing stays the same in this business for long.

I think we have internal users who depend on pipes. We will find out whether 
they can move on to streaming. If they can't because of any fundamental 
shortcomings of streaming, we will need to address those.

bq. I don't think we can remove in branch-2, but let's do this for trunk.
I think the source of confusion is the title, "Remove hadoop-pipes", and setting 
the target to 2.8. I will change the title to "deprecate".



[jira] [Commented] (HADOOP-12547) Remove hadoop-pipes

2015-11-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988229#comment-14988229
 ] 

Colin Patrick McCabe commented on HADOOP-12547:
---

Thank you for the perspective, [~aw].  It's true that you have been around for 
longer than me.  However, it's also true that in about 4 years of supporting 
customer Hadoop deployments I have never once seen anyone use or ask about 
Hadoop Pipes.  We've gotten requests for some pretty obscure things-- like 
adding a feature or fixing a bug in fuse_dfs, supporting the old obsolete MR1 
framework, preparing native code patches for decades-old versions of AIX, or 
even running Hadoop on JVMs that I'm convinced most people have never heard 
of.  But __never__ for pipes.

That Stack Overflow post looks like a newbie stumbling into Hadoop for the 
first time and trying to follow a tutorial from more than 5 years ago... and 
failing, because this stuff hasn't been maintained-- and won't be maintained in 
the future.  That's hardly a ringing endorsement of keeping this around.  
Anyway, nobody is proposing removing this from 2.6 or any branch-2 release... 
only from trunk.

bq. Pipes was written primarily for Yahoo!'s search team. It was provided as a 
way for C code to interface with MapReduce without requiring significant 
rewrites. It was definitely in use before I left Yahoo! but I haven't kept 
track of whether it is still being used. My guess is no, given most of that 
team has left/was shipped over to Microsoft.

[~daryn], [~kihwal], do you have any perspective on this?  Is there any reason 
to keep this around in trunk / branch-3.0?  If we are going to keep this, I 
would like to see some unit tests, documentation, and actual maintenance.



[jira] [Commented] (HADOOP-12547) Remove hadoop-pipes

2015-11-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988032#comment-14988032
 ] 

Allen Wittenauer commented on HADOOP-12547:
---

As someone who has been around a lot longer than Colin, let me fill in some 
blanks.

Pipes was written primarily for Yahoo!'s search team.  It was provided as a way 
for C code to interface with MapReduce without requiring significant rewrites.  
It was definitely in use before I left Yahoo! but I haven't kept track of 
whether it is still being used.  My guess is no, given most of that team has 
left/was shipped over to Microsoft.

Even so, there are definitely references out on the Internet in the last year 
to people using Pipes if one actually bothers to look for them, e.g., 
http://stackoverflow.com/questions/28573127/hadoop-pipes-wordcount-example-nullpointerexception-in-localjobrunner
, which features a comment made about Hadoop 2.6 about 5 days ago.



[jira] [Commented] (HADOOP-12547) Remove hadoop-pipes

2015-11-03 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987959#comment-14987959
 ] 

Andrew Wang commented on HADOOP-12547:
---

I don't think we can remove in branch-2, but let's do this for trunk.



[jira] [Commented] (HADOOP-12547) Remove hadoop-pipes

2015-11-03 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988009#comment-14988009
 ] 

Chris Nauroth commented on HADOOP-12547:


HADOOP-12518 is a recent patch for the hadoop-pipes build, targeted to 3.0.0.  
That implies that someone might be interested in keeping it.  [~aw], would you 
please comment, since you filed HADOOP-12518?

I agree that if we proceed with removing it, it would have to be done in 
trunk/3.0.0 only on grounds of backward compatibility.
