[jira] Resolved: (HADOOP-2360) hadoop::RecordReader::read() throws exception in HadoopPipes::RecordWriter

2007-12-06 Thread Yiping Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiping Han resolved HADOOP-2360.


Resolution: Cannot Reproduce

> hadoop::RecordReader::read() throws exception in HadoopPipes::RecordWriter
> --
>
> Key: HADOOP-2360
> URL: https://issues.apache.org/jira/browse/HADOOP-2360
> Project: Hadoop
>  Issue Type: Bug
>Affects Versions: 0.14.3
>Reporter: Yiping Han
>Priority: Minor
>
> The jute record is defined as:
>   class SampleValue 
>   {
>ustring data;
>   }
> And HadoopPipes::RecordWriter::emit() has code like this:
> void SampleRecordWriterC::emit(const std::string& key, const std::string& 
> value)
> {
> if (key.empty() || value.empty()) {
> return;
> }
> hadoop::StringInStream key_in_stream(const_cast<std::string&>(key));
> hadoop::RecordReader key_record_reader(key_in_stream, hadoop::kCSV);
> EmitKeyT emit_key;
> key_record_reader.read(emit_key);
> hadoop::StringInStream value_in_stream(const_cast<std::string&>(value));
> hadoop::RecordReader value_record_reader(value_in_stream, hadoop::kCSV);
> EmitValueT emit_value;
> value_record_reader.read(emit_value);
> return;
> }
> The code throws a hadoop::IOException at the read() line.
> In the mapper, I emitted a fake record with the following code:
> std::string value;
> EmitValueT emit_value;
> emit_value.getData().assign("FakeData");
> hadoop::StringOutStream value_out_stream(value);
> hadoop::RecordWriter value_record_writer(value_out_stream, hadoop::kCSV);
> value_record_writer.write(emit_value);
> We haven't updated to the latest version of Hadoop, but I've searched the 
> tickets and didn't find one reporting this problem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2360) hadoop::RecordReader::read() throws exception in HadoopPipes::RecordWriter

2007-12-06 Thread Yiping Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiping Han updated HADOOP-2360:
---

Priority: Minor  (was: Blocker)

> hadoop::RecordReader::read() throws exception in HadoopPipes::RecordWriter
> --
>
> Key: HADOOP-2360
> URL: https://issues.apache.org/jira/browse/HADOOP-2360
> Project: Hadoop
>  Issue Type: Bug
>Affects Versions: 0.14.3
>Reporter: Yiping Han
>Priority: Minor
>
> The jute record is defined as:
>   class SampleValue 
>   {
>ustring data;
>   }
> And HadoopPipes::RecordWriter::emit() has code like this:
> void SampleRecordWriterC::emit(const std::string& key, const std::string& 
> value)
> {
> if (key.empty() || value.empty()) {
> return;
> }
> hadoop::StringInStream key_in_stream(const_cast<std::string&>(key));
> hadoop::RecordReader key_record_reader(key_in_stream, hadoop::kCSV);
> EmitKeyT emit_key;
> key_record_reader.read(emit_key);
> hadoop::StringInStream value_in_stream(const_cast<std::string&>(value));
> hadoop::RecordReader value_record_reader(value_in_stream, hadoop::kCSV);
> EmitValueT emit_value;
> value_record_reader.read(emit_value);
> return;
> }
> The code throws a hadoop::IOException at the read() line.
> In the mapper, I emitted a fake record with the following code:
> std::string value;
> EmitValueT emit_value;
> emit_value.getData().assign("FakeData");
> hadoop::StringOutStream value_out_stream(value);
> hadoop::RecordWriter value_record_writer(value_out_stream, hadoop::kCSV);
> value_record_writer.write(emit_value);
> We haven't updated to the latest version of Hadoop, but I've searched the 
> tickets and didn't find one reporting this problem.




[jira] Created: (HADOOP-2360) hadoop::RecordReader::read() throws exception in HadoopPipes::RecordWriter

2007-12-05 Thread Yiping Han (JIRA)
hadoop::RecordReader::read() throws exception in HadoopPipes::RecordWriter
--

 Key: HADOOP-2360
 URL: https://issues.apache.org/jira/browse/HADOOP-2360
 Project: Hadoop
  Issue Type: Bug
Affects Versions: 0.14.3
Reporter: Yiping Han
Priority: Blocker


The jute record is defined as:

  class SampleValue 
  {
   ustring data;
  }

And HadoopPipes::RecordWriter::emit() has code like this:

void SampleRecordWriterC::emit(const std::string& key, const std::string& value)
{
if (key.empty() || value.empty()) {
return;
}

hadoop::StringInStream key_in_stream(const_cast<std::string&>(key));
hadoop::RecordReader key_record_reader(key_in_stream, hadoop::kCSV);
EmitKeyT emit_key;
key_record_reader.read(emit_key);

hadoop::StringInStream value_in_stream(const_cast<std::string&>(value));
hadoop::RecordReader value_record_reader(value_in_stream, hadoop::kCSV);
EmitValueT emit_value;

value_record_reader.read(emit_value);

return;
}

The code throws a hadoop::IOException at the read() line.


In the mapper, I emitted a fake record with the following code:

std::string value;
EmitValueT emit_value;

emit_value.getData().assign("FakeData");

hadoop::StringOutStream value_out_stream(value);
hadoop::RecordWriter value_record_writer(value_out_stream, hadoop::kCSV);
value_record_writer.write(emit_value);

We haven't updated to the latest version of Hadoop, but I've searched the 
tickets and didn't find one reporting this problem.




[jira] Created: (HADOOP-2162) Provide last failure point when retry a mapper task

2007-11-06 Thread Yiping Han (JIRA)
Provide last failure point when retry a mapper task
---

 Key: HADOOP-2162
 URL: https://issues.apache.org/jira/browse/HADOOP-2162
 Project: Hadoop
  Issue Type: New Feature
Reporter: Yiping Han


Currently, when a mapper fails and gets restarted, the restarted mapper can 
tell from the task name whether it is a retry or the first attempt. We also 
want to know where in the input the last attempt failed.

With the last failure point, our mapper can then do something different for 
that particular input. The failure point does not need to be very accurate. 
The reason we ask for this, instead of letting Hadoop deal with the failing 
record, is that this way we can handle the failing record specially instead 
of simply skipping it.




[jira] Commented: (HADOOP-1864) Support for big jar file (>2G)

2007-10-25 Thread Yiping Han (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537755
 ] 

Yiping Han commented on HADOOP-1864:


Milind,

Yes. Either this issue or 2019 should satisfy our requirement.

> Support for big jar file (>2G)
> --
>
> Key: HADOOP-1864
> URL: https://issues.apache.org/jira/browse/HADOOP-1864
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.14.1
>Reporter: Yiping Han
>Priority: Critical
>
> We have a huge binary that needs to be distributed onto tasktracker nodes 
> in Hadoop streaming mode. We've tried both the -file option and the 
> -cacheArchive option; it seems the tasktracker node cannot unjar jar files 
> bigger than 2G. We are considering splitting our binaries into multiple 
> jars, but with -file it seems we cannot do that. We would also prefer the 
> -cacheArchive option for performance reasons, but it seems -cacheArchive 
> does not allow more than one appearance in the streaming options. Even if 
> -cacheArchive supported multiple jars, we would still need a way to put the 
> jars into a single directory tree instead of using multiple symbolic links. 
> So, in general, we need a feasible and efficient way to distribute large 
> (>2G) binaries for Hadoop streaming. We don't know whether there is an 
> existing solution that we either didn't find or used incorrectly, or 
> whether some extra work is needed to provide one.




[jira] Updated: (HADOOP-1864) Support for big jar file (>2G)

2007-10-22 Thread Yiping Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiping Han updated HADOOP-1864:
---

 Priority: Critical  (was: Major)
Affects Version/s: 0.14.1

We are now raising the priority for this fix. I've confirmed that Java v1.6 
does not solve this problem.

> Support for big jar file (>2G)
> --
>
> Key: HADOOP-1864
> URL: https://issues.apache.org/jira/browse/HADOOP-1864
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.14.1
>Reporter: Yiping Han
>Priority: Critical
>
> We have a huge binary that needs to be distributed onto tasktracker nodes 
> in Hadoop streaming mode. We've tried both the -file option and the 
> -cacheArchive option; it seems the tasktracker node cannot unjar jar files 
> bigger than 2G. We are considering splitting our binaries into multiple 
> jars, but with -file it seems we cannot do that. We would also prefer the 
> -cacheArchive option for performance reasons, but it seems -cacheArchive 
> does not allow more than one appearance in the streaming options. Even if 
> -cacheArchive supported multiple jars, we would still need a way to put the 
> jars into a single directory tree instead of using multiple symbolic links. 
> So, in general, we need a feasible and efficient way to distribute large 
> (>2G) binaries for Hadoop streaming. We don't know whether there is an 
> existing solution that we either didn't find or used incorrectly, or 
> whether some extra work is needed to provide one.




[jira] Updated: (HADOOP-1865) "org.apache.hadoop.metrics.jvm.EventCounter" not instantiate error

2007-09-13 Thread Yiping Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiping Han updated HADOOP-1865:
---

Description: 
Hi, 

I get a "could not instantiate org.apache.hadoop.metrics.jvm.EventCounter" 
error. The error happens for every Hadoop command, but it does not seem to 
block any operation from succeeding. Does anyone have an idea?


  was:
Hi, 

I get the following error for every command I run on Hadoop, but the commands 
still seem to work. Can you help find out what's wrong here? Thanks!

bash-3.00$ bin/start-all.sh
starting namenode, logging to 
/export/crawlspace/yhan/hadoop/hadoop-0.13.1/bin/../logs/hadoop-yhan-namenode-idev43.out
log4j:ERROR Could not instantiate class 
[org.apache.hadoop.metrics.jvm.EventCounter].
java.lang.ClassNotFoundException: org.apache.hadoop.metrics.jvm.EventCounter
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
at java.lang.Class.forName0(Native Method)



> "org.apache.hadoop.metrics.jvm.EventCounter" not instantiate error
> --
>
> Key: HADOOP-1865
> URL: https://issues.apache.org/jira/browse/HADOOP-1865
> Project: Hadoop
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Yiping Han
>Priority: Critical
>
> Hi, 
> I get a "could not instantiate org.apache.hadoop.metrics.jvm.EventCounter" 
> error. The error happens for every Hadoop command, but it does not seem to 
> block any operation from succeeding. Does anyone have an idea?




[jira] Resolved: (HADOOP-1865) "org.apache.hadoop.metrics.jvm.EventCounter" not instantiate error

2007-09-10 Thread Yiping Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiping Han resolved HADOOP-1865.


Resolution: Fixed

This seems to be related to the configuration files that I modified.
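For anyone hitting the same symptom: in Hadoop of this era the error usually traces back to conf/log4j.properties naming the EventCounter appender while the class is missing from (or renamed on) the classpath. The property lines below are an assumption based on typical defaults of that period, not taken from this thread:

```properties
# Hypothetical conf/log4j.properties excerpt (an assumption, for illustration).
# log4j instantiates every appender listed on the root logger at startup, so a
# stale class name here reproduces the "Could not instantiate class" error on
# every command without actually breaking the command itself.
log4j.rootLogger=INFO,console,EventCounter
log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
```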

> "org.apache.hadoop.metrics.jvm.EventCounter" not instantiate error
> --
>
> Key: HADOOP-1865
> URL: https://issues.apache.org/jira/browse/HADOOP-1865
> Project: Hadoop
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Yiping Han
>Priority: Critical
>
> Hi, 
> I get the following error for every command I run on Hadoop, but the 
> commands still seem to work. Can you help find out what's wrong here? Thanks!
> bash-3.00$ bin/start-all.sh
> starting namenode, logging to 
> /export/crawlspace/yhan/hadoop/hadoop-0.13.1/bin/../logs/hadoop-yhan-namenode-idev43.out
> log4j:ERROR Could not instantiate class 
> [org.apache.hadoop.metrics.jvm.EventCounter].
> java.lang.ClassNotFoundException: org.apache.hadoop.metrics.jvm.EventCounter
> at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
> at java.lang.Class.forName0(Native Method)




[jira] Created: (HADOOP-1865) "org.apache.hadoop.metrics.jvm.EventCounter" not instantiate error

2007-09-09 Thread Yiping Han (JIRA)
"org.apache.hadoop.metrics.jvm.EventCounter" not instantiate error
--

 Key: HADOOP-1865
 URL: https://issues.apache.org/jira/browse/HADOOP-1865
 Project: Hadoop
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Yiping Han
Priority: Critical


Hi, 

I get the following error for every command I run on Hadoop, but the commands 
still seem to work. Can you help find out what's wrong here? Thanks!

bash-3.00$ bin/start-all.sh
starting namenode, logging to 
/export/crawlspace/yhan/hadoop/hadoop-0.13.1/bin/../logs/hadoop-yhan-namenode-idev43.out
log4j:ERROR Could not instantiate class 
[org.apache.hadoop.metrics.jvm.EventCounter].
java.lang.ClassNotFoundException: org.apache.hadoop.metrics.jvm.EventCounter
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
at java.lang.Class.forName0(Native Method)





[jira] Created: (HADOOP-1864) Support for big jar file (>2G)

2007-09-07 Thread Yiping Han (JIRA)
Support for big jar file (>2G)
--

 Key: HADOOP-1864
 URL: https://issues.apache.org/jira/browse/HADOOP-1864
 Project: Hadoop
  Issue Type: Bug
  Components: contrib/streaming
Reporter: Yiping Han


We have a huge binary that needs to be distributed onto tasktracker nodes in 
Hadoop streaming mode. We've tried both the -file option and the -cacheArchive 
option; it seems the tasktracker node cannot unjar jar files bigger than 2G. 
We are considering splitting our binaries into multiple jars, but with -file 
it seems we cannot do that. We would also prefer the -cacheArchive option for 
performance reasons, but it seems -cacheArchive does not allow more than one 
appearance in the streaming options. Even if -cacheArchive supported multiple 
jars, we would still need a way to put the jars into a single directory tree 
instead of using multiple symbolic links.

So, in general, we need a feasible and efficient way to distribute large (>2G) 
binaries for Hadoop streaming. We don't know whether there is an existing 
solution that we either didn't find or used incorrectly, or whether some extra 
work is needed to provide one.
