[jira] [Closed] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-09-17 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed NUTCH-1936.
---
Resolution: Fixed

> GSoC 2015 - Move Nutch to Hadoop 2.X
> 
>
> Key: NUTCH-1936
> URL: https://issues.apache.org/jira/browse/NUTCH-1936
> Project: Nutch
>  Issue Type: Task
>  Components: build
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>  Labels: gsoc2015
> Fix For: 1.11, 2.3.1
>
> Attachments: NUTCH-1939.patch
>
>
> The Nutch PMC 
> [discussed|http://www.mail-archive.com/dev%40nutch.apache.org/msg16250.html] 
> ideas for a good 2015 GSoC project. It appears that porting the (trunk) 
> codebase to [Hadoop 2.X|http://hadoop.apache.org/docs/stable/] seems to be an 
> attractive option and one which would present an excellent learning 
> experience for a summer student.
> A more comprehensive description of this issue should be included within 
> either a mentor-defined project description or a successful student 
> application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-09-17 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened NUTCH-1936:
-



[jira] [Updated] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-09-17 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1936:

Fix Version/s: (was: 2.4)
   2.3.1



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-08-21 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706883#comment-14706883
 ] 

Chris A. Mattmann commented on NUTCH-1936:
--

+1



[jira] [Updated] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-07-24 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1936:

Attachment: NUTCH-1939.patch

Prelim patch which folks can try out.
N.B. tests fail with an IOException regarding a failure to load specific 
mapred-site.xml properties.
I am not sure that all API upgrades are done 100% properly; however, this is an 
effort for us to upgrade to 2.X.
I need to admit, I've pegged dependencies at 2.4.0 simply because this is what 
EMR uses... and right now we are using EMR for crawls. This is not bias on my 
part; it is merely my observation that both client and server should be 
using the same version. I understand that this is not adequate for everyone.
[~mjoyce]
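
As a hedged illustration of what pegging the Hadoop dependencies at 2.4.0 could look like in an Ivy dependency file (Nutch builds with Ant and Ivy): the artifact list below is an assumption made for illustration and is not taken from the attached patch.

```xml
<!-- Sketch only: Hadoop 2.4.0 client-side artifacts.
     The exact set used by the patch is not shown in this thread. -->
<dependency org="org.apache.hadoop" name="hadoop-common" rev="2.4.0" conf="*->default"/>
<dependency org="org.apache.hadoop" name="hadoop-hdfs" rev="2.4.0" conf="*->default"/>
<dependency org="org.apache.hadoop" name="hadoop-mapreduce-client-core" rev="2.4.0" conf="*->default"/>
<dependency org="org.apache.hadoop" name="hadoop-mapreduce-client-jobclient" rev="2.4.0" conf="*->default"/>
```

Keeping `rev` aligned with the Hadoop version running on the cluster follows the observation above that client and server should match.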



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-07-24 Thread Michael Joyce (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641210#comment-14641210
 ] 

Michael Joyce commented on NUTCH-1936:
--

Ah this is absolutely awesome Lewis. Great job on this.



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-07-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641230#comment-14641230
 ] 

Lewis John McGibbney commented on NUTCH-1936:
-

One other note: this DOES NOT, of course, upgrade app .mapred. APIs to mapreduce. 
It does, however, remove all deprecations in Hadoop-oriented code.



[jira] [Assigned] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-07-24 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned NUTCH-1936:
---

Assignee: Lewis John McGibbney



[jira] [Closed] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-07-24 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed NUTCH-1936.
---
Resolution: Duplicate

In favor of NUTCH-2049



[Nutch Wiki] Update of GoogleSummerOfCode/Ideas 2015 - Move Nutch to Hadoop 2.X by HalilSimsek

2015-05-20 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Nutch Wiki for change 
notification.

The GoogleSummerOfCode/Ideas 2015 - Move Nutch to Hadoop 2.X page has been 
changed by HalilSimsek:
https://wiki.apache.org/nutch/GoogleSummerOfCode/Ideas%202015%20-%20Move%20Nutch%20to%20Hadoop%202.X

Comment:
Ideas page for GSoC 2015 is created

New page:
 NUTCH-1936 GSoC 2015 - Move Nutch to Hadoop 2.X 
= Description =
The Nutch PMC discussed ideas for a good 2015 GSoC project. It appears that 
porting the (trunk) codebase to Hadoop 2.X seems to be an attractive option and 
one which would present an excellent learning experience for a summer student. 
Essentially this project will involve porting ALL aspects of every class which 
utilizes the Hadoop 1.X codebase to Hadoop 2.X. A good starting point would be 
to understand what has already been discussed in the existing 
[[https://wiki.apache.org/nutch/GoogleSummerOfCode#Jira_Issues|JIRA Issues]].

= Interested Mentors =
Lewis John McGibbney

= Student Proposals =
 * Name:
 * University:
 * Short Description of Interests in GSoC:
 * Proposal: Please attach as a PDF/Word Document and simply link to it from 
here.

= Reports =
TODO

= Documentation =
TODO

= Jira Issues =

 * [[https://issues.apache.org/jira/browse/NUTCH-1936|NUTCH-1936]] - GSoC 2015 
- Move Nutch to Hadoop 2.X (Parent Issue)
 * [[https://issues.apache.org/jira/browse/NUTCH-1219|NUTCH-1219]] - Upgrade 
all jobs to new MapReduce API


[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-15 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362554#comment-14362554
 ] 

Lewis John McGibbney commented on NUTCH-1936:
-

Please start getting together a project proposal.
You should sign up to the Nutch wiki and add your proposal there.
https://wiki.apache.org/nutch/GoogleSummerOfCode
Thank you





-- 
*Lewis*




Re: [jira] [Issue Comment Deleted] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-12 Thread Mohit Bagde
Hi,

My name is Mohit Bagde. I am currently doing my Master's in CS at USC. I
have taken CS572 Information Retrieval and Search Engines under Prof.
Mattmann and have worked on Nutch 1.X as part of the first assignment,
which involved crawling with Nutch, integrating with Tika, and
subsequently developing a plugin in Nutch. I have also taken INF 550 under
Prof. Kim, where I am learning about HDFS and MapReduce, and I find that
both these subjects have a common point in the JIRA issue NUTCH-1936, which
is about porting Nutch to Hadoop 2.X.

I would like to know, at a very high level, what the requirements for this
project are, and what kind of background is required.
I would like to submit a project proposal but I am not entirely sure what
to put into it. I enjoyed working with Nutch and found the entire
experience to be very educational. I would like to continue to develop
and contribute to Nutch in any way possible. I would be really
obliged if you could give some more insight into this JIRA issue.

Sincerely,

Mohit Bagde.




-- 

Mohit Bagde
Graduate Student,
Computer Science,
University of Southern California,
Los Angeles, CA 90007.


[jira] [Issue Comment Deleted] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-10 Thread Ashwini Tokekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwini Tokekar updated NUTCH-1936:
---
Comment: was deleted

(was: Thanks Lewis)



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-10 Thread Ashwini Tokekar (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356294#comment-14356294
 ] 

Ashwini Tokekar commented on NUTCH-1936:


Thanks Lewis



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-10 Thread Ashwini Tokekar (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356295#comment-14356295
 ] 

Ashwini Tokekar commented on NUTCH-1936:


Thanks Lewis



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355939#comment-14355939
 ] 

Lewis John McGibbney commented on NUTCH-1936:
-

Hi [~ashwini.tokekar]
bq. I have a request can you please share more details about what are your 
expected outcomes from this project. 
The Hadoop API is used in a pervasive fashion throughout the Nutch trunk codebase. 
Right now the entire codebase relies upon Hadoop 1.X. The idea is to move EVERY 
tool, and every instance of every class which uses the Hadoop 1.X API, over to 
Hadoop 2.X.
I would suggest that you take time to look into the existing issues open for 
similar tasks, e.g. https://wiki.apache.org/nutch/GoogleSummerOfCode#Jira_Issues



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355931#comment-14355931
 ] 

Lewis John McGibbney commented on NUTCH-1936:
-

[~petr.shypila], you can now add to the wiki page. Thanks. You can put it here: 
https://wiki.apache.org/nutch/GoogleSummerOfCode#NUTCH-1936_GSoC_2015_-_Move_Nutch_to_Hadoop_2.X




[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-09 Thread Ashwini Tokekar (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353537#comment-14353537
 ] 

Ashwini Tokekar commented on NUTCH-1936:


Hi Lewis,
I am interested in this project. I have worked on Hadoop 1.0.3 in my 
undergraduate study. I have developed a search engine using it. I have a request 
can you please share more details about what are your expected outcomes from 
this project. 

Ashwini
The link to my LinkedIn profile is:
http://in.linkedin.com/pub/ashwini-tokekar/78/150/403



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-02 Thread Petr Shypila (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344560#comment-14344560
 ] 

Petr Shypila commented on NUTCH-1936:
-

Hi Lewis.

I'm very interested in this project as a part of GSoC. I have experience with 
both Hadoop versions and more than a year of experience in production. So I 
think it will not be too hard for me, but still a challenge.



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-02 Thread Petr Shypila (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344589#comment-14344589
 ] 

Petr Shypila commented on NUTCH-1936:
-

Lewis,

Thank you for your quick reply. My wiki username is *PetrShypila*. 
I'll try to create a proposal within a week (two at most), since I need time 
to investigate this project more deeply.

Thanks,
Petr



[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-03-02 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344573#comment-14344573
 ] 

Lewis John McGibbney commented on NUTCH-1936:
-

Hi Petr,
Grand.
Please begin preparing your proposal.
Can you please provide us with your wiki username so you can develop your
proposal on the wiki?
Thanks
Lewis




-- 
*Lewis*




Move Nutch to Hadoop 2.X

2015-02-11 Thread Dulaj Viduranga
Hi,
My name is Dulaj Viduranga and I’m a 3rd year Computer Science and 
Engineering student at University of Moratuwa, Sri Lanka.
I’m excited about the Move Nutch to Hadoop 2.X project and I would like 
to contribute to it. Also, if you are willing, I would be 
very excited to take this on as my GSoC 2015 project this summer.
Please let me know how to get involved.

Thank you.
Dulaj Viduranga.

Re: Move Nutch to Hadoop 2.X

2015-02-11 Thread Mattmann, Chris A (3980)
Great, Dulaj. I think one of the starting points would be to
work to engage via JIRA since I think Lewis has created a JIRA
issue for this and tagged the appropriate issue as gsoc2015.

We would welcome you via GSOC and I recommend you begin engaging
via JIRA to get started on your proposal ASAP.

Cheers and welcome!

Cheers,
Chris



++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++









[jira] [Commented] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-02-05 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307725#comment-14307725
 ] 

Lewis John McGibbney commented on NUTCH-1936:
-

https://wiki.apache.org/nutch/GoogleSummerOfCode#NUTCH-1936_GSoC_2015_-_Move_Nutch_to_Hadoop_2.X



[jira] [Created] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

2015-02-05 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created NUTCH-1936:
---

 Summary: GSoC 2015 - Move Nutch to Hadoop 2.X
 Key: NUTCH-1936
 URL: https://issues.apache.org/jira/browse/NUTCH-1936
 Project: Nutch
  Issue Type: Task
  Components: build
Reporter: Lewis John McGibbney
 Fix For: 2.4, 1.11


The Nutch PMC 
[discussed|http://www.mail-archive.com/dev%40nutch.apache.org/msg16250.html] 
ideas for a good 2015 GSoC project. It appears that porting the (trunk) 
codebase to [Hadoop 2.X|http://hadoop.apache.org/docs/stable/] seems to be an 
attractive option and one which would present an excellent learning experience 
for a summer student.

A more comprehensive description of this issue should be included within either 
a mentor-defined project description or a successful student application.


