how to decode the metadata file of a block
Hi, can somebody give me some insight into how to read the contents of a block's metadata file using the HDFS APIs, and into the encoding that is used? Thanks, Vidur
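Treat the following as a hedged sketch only: as I understand the 0.20-era layout (check BlockMetadataHeader and DataChecksum in the source tree to confirm), a block's .meta file starts with a 2-byte version, then a 1-byte checksum type (0 = NULL, 1 = CRC32) and a 4-byte bytesPerChecksum, followed by one 4-byte CRC32 per bytesPerChecksum-sized chunk of the block file. The file name below is hypothetical.

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ReadBlockMeta {
  public static void main(String[] args) throws IOException {
    DataInputStream in = new DataInputStream(
        new FileInputStream("blk_1234567890_1001.meta")); // hypothetical name
    short version = in.readShort();      // header version
    byte checksumType = in.readByte();   // 0 = NULL, 1 = CRC32
    int bytesPerChecksum = in.readInt(); // chunk size each CRC covers
    System.out.println("version=" + version + " type=" + checksumType
        + " bytesPerChecksum=" + bytesPerChecksum);
    // the remainder is one 4-byte CRC32 per chunk of the block file
    int chunk = 0;
    while (in.available() >= 4) {
      long crc = in.readInt() & 0xffffffffL;
      System.out.println("chunk " + chunk++ + " crc32=0x"
          + Long.toHexString(crc));
    }
    in.close();
  }
}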
Is it possible ....!!!
Hi, I wanted to ask if it is possible to intercept every communication that takes place between Hadoop's MapReduce daemons, i.e. between the JobTracker and the TaskTracker, and make it pass through my own communication library. So, if the JobTracker and TaskTracker talk over HTTP or RPC, I would like to intercept the call and route it through my communication library. If that is possible, can anyone tell me which classes in Hadoop's distribution I need to look at? Similarly, for HDFS, is it possible to route all the communication between the NameNode and the DataNodes through my communication library? The reason is that I want all communication to go through a library that resolves every communication problem we can have, e.g. firewalls, NAT, non-routed paths, multihoming, etc. With that library, all the communication headaches would be gone, so we would be able to use Hadoop quite easily and there would be no communication problems. That is my master's project, so I want to know how to start and where to look. I would really appreciate a reply. Regards, Ahmad Shahzad
Re: Is it possible ....!!! COOL!
Hey, this is a really neat idea. If anyone has a way to do this, could you share? I'll bet this could be very interesting! Thanks... Best, HAL
Re: Is it possible ....!!! COOL!
Sounds like it could be a SPOF.
Appending and seeking files while writing
Hi. Was the append functionality finally added in the 0.20.1 version? Also, is the ability to seek within a file that is being written, and to write data at another position, supported? Thanks in advance!
Java run-time error while executing my application - unable to find the files on the HDFS
Hello friends, I have built my own Java application that performs some map-reduce operations on the input files. I have loaded my files into HDFS under the following paths:

/user/sam/input/1.txt
/user/sam/input/corrected
/user/sam/input/in

When I use the command $ hadoop dfs -cat /user/sam/input/1.txt it outputs the contents of the file correctly. My application refers to the files on HDFS through Java strings, as follows:

String str = "hdfs://192.168.1.1:9000/user/sam/input";
String file1 = str + "1.txt";
String file2 = str + "Corrected";

Here file1 and file2 are fed as input to my mapper functions. After starting my daemons, I ran my application as follows:

$ hadoop jar maximum.jar /user/sam/input/in output

It generates this error:

java.io.FileNotFoundException: hdfs://192.168.1.1:9000/user/sam/input/1.txt (No such file or directory)

But when I type $ hadoop dfs -cat hdfs://192.168.1.1:9000/user/sam/input/1.txt it outputs the contents of the file correctly. I tried other possible paths, such as:

String str = "/user/sam/input/";
String str = "hdfs:/user/sam/input";

But none of the above paths works. Could anyone point out the possible mistake? Any suggestion is welcome. Thanks, Bharath
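A java.io.FileNotFoundException that names a full hdfs:// URL usually means the string is being handed to java.io classes (File, FileInputStream, FileReader), which only know the local filesystem. A minimal sketch of opening the same file through Hadoop's FileSystem API instead; the path and NameNode address are taken from the post above, the rest is illustrative:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRead {
  public static void main(String[] args) throws IOException {
    Path file1 = new Path("hdfs://192.168.1.1:9000/user/sam/input/1.txt");
    // FileSystem resolves the hdfs:// scheme; java.io.File never will
    FileSystem fs = FileSystem.get(file1.toUri(), new Configuration());
    BufferedReader reader =
        new BufferedReader(new InputStreamReader(fs.open(file1)));
    String line;
    while ((line = reader.readLine()) != null) {
      System.out.println(line);
    }
    reader.close();
  }
}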
Re: Is it possible ....!!!
On Jun 10, 2010, at 3:25 AM, Ahmad Shahzad wrote: The reason is that I want all communication to go through a library that resolves every communication problem we can have, e.g. firewalls, NAT, non-routed paths, multihoming, etc. I know Owen pointed you towards using proxies, but anything remotely complex would probably be better done in an interposer library, as then it is application agnostic.
Help:how to read a xml file in hadoop framework
Dear all, I need to read an XML file in my application. When I run it as an application in Eclipse it runs correctly, but when I run it on Hadoop it gives this error:

10/06/10 15:52:52 INFO input.FileInputFormat: Total input paths to process : 22
10/06/10 15:52:53 INFO mapred.JobClient: Running job: job_201006101455_0010
10/06/10 15:52:54 INFO mapred.JobClient: map 0% reduce 0%
10/06/10 15:53:07 INFO mapred.JobClient: Task Id : attempt_201006101455_0010_m_00_0, Status : FAILED
java.lang.NullPointerException
    at WordCountMapper.map(WordCountMapper.java:61)
    at WordCountMapper.map(WordCountMapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
attempt_201006101455_0010_m_00_0: java.io.FileNotFoundException: /tmp/hadoop-hadoop/mapred/local/taskTracker/jobcache/job_201006101455_0010/attempt_201006101455_0010_m_00_0/work/readme.xml (No such file or directory)
attempt_201006101455_0010_m_00_0:     at java.io.FileInputStream.open(Native Method)
attempt_201006101455_0010_m_00_0:     at java.io.FileInputStream.<init>(FileInputStream.java:106)

From the error message, we can see the XML file can't be found. Hope you can help me, thanks. Best regards, Jander
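The task is looking for readme.xml in its local working directory on the TaskTracker, where nothing has placed it; reading it in Eclipse works only because the file happens to sit in the local project directory. One common fix is to ship the file with the job via the DistributedCache. A sketch, assuming the file has already been uploaded to HDFS at /user/jander/readme.xml (that path is made up for illustration):

import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class XmlJobDriver {
  public static void main(String[] args)
      throws IOException, URISyntaxException {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "job-with-side-xml");
    // ship the HDFS file to every task's local disk before the job starts
    DistributedCache.addCacheFile(new URI("/user/jander/readme.xml"),
        job.getConfiguration());
    // ... set mapper/reducer/input/output and submit as usual ...
  }
}

// In the mapper, pick up the local copy in setup() instead of hard-coding
// a path that only exists on the client machine.
class XmlAwareMapper extends Mapper<Object, Object, Object, Object> {
  private File xmlFile;

  @Override
  protected void setup(Context context) throws IOException {
    Path[] cached =
        DistributedCache.getLocalCacheFiles(context.getConfiguration());
    xmlFile = new File(cached[0].toString()); // local copy of readme.xml
  }
}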
Re: Is it possible ....!!!
You can define your own socket factory by setting the configuration parameter hadoop.rpc.socket.factory.class.default to the class name of a SocketFactory. It is also possible to define socket factories on a protocol-by-protocol basis. Look at the code in NetUtils.getSocketFactory. -- Owen
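A minimal sketch of such a factory; the class name InterceptingSocketFactory is made up, and the interesting hook is the no-argument createSocket(), since Hadoop's RPC client creates unconnected sockets and connects them itself:

import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import javax.net.SocketFactory;

public class InterceptingSocketFactory extends SocketFactory {

  // Hadoop's RPC client calls this variant and connects the socket later,
  // so returning your library's Socket subclass here intercepts everything.
  @Override
  public Socket createSocket() throws IOException {
    return new Socket(); // swap in the custom transport's Socket here
  }

  @Override
  public Socket createSocket(String host, int port) throws IOException {
    Socket s = createSocket();
    s.connect(new InetSocketAddress(host, port));
    return s;
  }

  @Override
  public Socket createSocket(String host, int port, InetAddress localAddr,
      int localPort) throws IOException {
    Socket s = createSocket();
    s.bind(new InetSocketAddress(localAddr, localPort));
    s.connect(new InetSocketAddress(host, port));
    return s;
  }

  @Override
  public Socket createSocket(InetAddress host, int port) throws IOException {
    return createSocket(host.getHostAddress(), port);
  }

  @Override
  public Socket createSocket(InetAddress host, int port,
      InetAddress localAddr, int localPort) throws IOException {
    return createSocket(host.getHostAddress(), port, localAddr, localPort);
  }
}

Wire it up with conf.set("hadoop.rpc.socket.factory.class.default", "InterceptingSocketFactory"), or the equivalent property in core-site.xml; per Owen's note, NetUtils.getSocketFactory also consults a per-protocol variant of that key.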
Re: Is it possible ....!!!
Aaron Kimball wrote: Hadoop has some classes for controlling how sockets are used. See org.apache.hadoop.net.StandardSocketFactory and SocksSocketFactory. The socket factory implementation chosen is controlled by the hadoop.rpc.socket.factory.class.default configuration parameter. You could probably write your own SocketFactory that gives back socket implementations that tee the conversation to another port, or to a file, etc. So it's possible, but I don't know that anyone has implemented this. I think others may have examined Hadoop's protocols via Wireshark or other external tools, but those don't have much insight into Hadoop's internals. (Neither, for that matter, would the socket factory. You'd probably need to be pretty clever to introspect as to exactly what type of message is being sent and actually do semantic analysis, etc.)

Also worry about anything opening a URL, for which there are JVM-level factories, and about Jetty, which opens its own listeners, though presumably it's the clients you'd want to play with. I'm going to be honest and say this is a fairly ambitious project for a master's thesis, because you are going to be nestling deep into code across the system, possibly making changes whose benefits people who run well-managed datacentres won't see (they don't have connectivity problems, as they set up the machines and the network properly; it's only people like me whose home desktop is badly configured: https://issues.apache.org/jira/browse/HADOOP-3426 ). Now, what might be handy is better diagnostics of the configuration:

1. Code to run on every machine to test the network, look at the config, play with DNS, detect problems, and report them with meaningful errors that point to wiki pages with hints.
2. Every service that opens ports logging this event somewhere (ideally in a service base class), so instead of trying to work out which ports Hadoop is using by playing with netstat -p and jps -v, you can make a query of the nodes (command line, signal, and GET /ports) and get each service's list of active protocols, ports, and IP addresses as text or JSON.
3. Some class to take that JSON list and then try to access the various things, logging failures (a minimal probe is sketched after this message).
4. Some MR jobs to run the code in (3) and see what happens.
5. Some MR jobs whose aim in life is to measure network bandwidth and do stats on round-trip times.

Just a thought :) See also some thoughts of mine on Hadoop/university collaboration: http://www.slideshare.net/steve_l/hadoop-and-universities
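As a flavour of item (3), a minimal probe, assuming the endpoint list has already been fetched from the nodes (the hostnames and ports below are placeholders):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
  public static void main(String[] args) {
    // placeholder endpoints; in practice these would come from the
    // JSON list described in item (2)
    String[] endpoints = {"namenode:8020", "jobtracker:8021"};
    for (String endpoint : endpoints) {
      String[] parts = endpoint.split(":");
      Socket s = new Socket();
      try {
        s.connect(new InetSocketAddress(parts[0],
            Integer.parseInt(parts[1])), 2000); // 2 second timeout
        System.out.println(endpoint + " reachable");
      } catch (IOException e) {
        System.out.println(endpoint + " UNREACHABLE: " + e.getMessage());
      } finally {
        try { s.close(); } catch (IOException ignored) { }
      }
    }
  }
}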
Re: just because you can, it doesn't mean you should....
All, okay, I was being facetious earlier with the 'COOL' comment. This is a very bad idea. Well, not so much bad, but think about the ramifications of what you are proposing. Putting a 'comm' code library together that facilitates comms and 'helps' with architecture issues also creates a SPOF (as another gent pointed out); moreover, it creates a nice target for exploitation, as the library will undoubtedly become a repository of embedded passwords, alternate dummy accounts, bypass routes, and all sorts of goop to make things 'easier'. And since it has to be world-readable and easy to get access to, it will be very tough to protect - or easy to DoS/DDoS. Anything and everything from random timing attacks, substitution spoofs, TOCTOUs, you name it. This whole thing is already a very nice open highway to distribute embedded and tunneled 'items' of a certain unnatural nature; don't try to override what little security you have already by 'punching holes in the firewall' and other silly stuff. In the long run, what might be better is a discovery agent that provides continual validation of paths and service availability specific to Hadoop and its subprograms. That way any outage or problem can be immediately addressed or brought to the attention of the SysAds/networkers - like a service monitoring program. Just don't make it simple for the 'hats out there to own you in under five minutes flat (especially with an RPC or SOAP call to some lib or flat file - and ssh/ssl abso-lu-tely does not matter, trust me). You can disagree, and I really don't mean to be a 'buzz kill', but if you ask your local 'Sheriff', I think you'll be advised not to pursue this path too heavily. Have a good computational day... Best, Hal

Aaron Kimball wrote: [...] Allen's suggestion is probably more correct, but might incur additional work on your part. Cheers, - Aaron
Re: the same key in different reducers
Hi, and thank you for the answers. I didn't check the email and now I see 7 answers; it is really great. Let me explain in more detail why I am asking such a strange question :-) As I wrote before, I write to HBase from a Hadoop job; the writing actually happens in the reduce part of the job. Assume I have 3 reducers (all of them write to HBase) and suppose reducer 1 and reducer 3 produce the same key. In that case I need to check whether HBase already contains the key (which requires a select operation against HBase); if yes, I have to merge with the already-inserted record and write it back to HBase. BUT in my case the information is organized in such a way that I have no problem with duplicate keys, so I can skip the expensive HBase select and use only insert operations. But to use only inserts, I need to know that each reducer has a unique output key (that K3 is unique across reducers).

input: InputFormat<K1,V1>
mapper: Mapper<K1,V1,K2,V2>
combiner: Reducer<K2,V2,K2,V2>
reducer: Reducer<K2,V2,K3,V3>
output: RecordWriter<K3,V3>

On Thu, Jun 10, 2010 at 12:40 AM, James Seigel ja...@tynt.com wrote: Oleg, are you wanting to have them in different reducers? If so, then you can write a Comparable object to make that happen. If you want them to be on the same reducer, then that is what Hadoop will do. :) On 2010-06-09, at 3:06 PM, Ted Yu wrote: Can you disclose more about how K3 is generated? From your description below, it is possible. On Wed, Jun 9, 2010 at 1:17 AM, Oleg Ruchovets oruchov...@gmail.com wrote: Hi, my Hadoop job writes the results of map/reduce to HBase. I have 3 reducers. Here is the sequence of input and output parameters for the Mapper, Combiner and Reducer:

input: InputFormat<K1,V1>
mapper: Mapper<K1,V1,K2,V2>
combiner: Reducer<K2,V2,K2,V2>
reducer: Reducer<K2,V2,K3,V3>
output: RecordWriter<K3,V3>

My question: is it possible that more than one reducer has the same output key K3? Meaning, in case I have 3 reducers, is it possible that

reducer1: K3 = 1, V3 = [1,2,3]
reducer2: K3 = 2, V3 = [5,6,9]
reducer3: K3 = 1, V3 = [10,15,22]

As you can see, reducer1 has K3 = 1 and reducer3 also has K3 = 1. So is that case possible, or does each reducer have a unique output key? Thanks in advance, Oleg.
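For what it's worth, the uniqueness guarantee applies to the reducer input key, not the output key: the default partitioner routes every record with a given K2 to exactly one reducer, but nothing constrains the K3 a reducer chooses to emit. So if K3 is derived from K2 by a one-to-one function, no two reducers can emit the same K3; otherwise collisions are possible. A sketch after Hadoop's org.apache.hadoop.mapreduce.lib.partition.HashPartitioner (the class name here is made up; the logic is the stock behavior):

import org.apache.hadoop.mapreduce.Partitioner;

// Identical keys always hash to the same partition number, so all
// records with a given K2 are routed to exactly one reducer.
public class SketchHashPartitioner<K, V> extends Partitioner<K, V> {
  @Override
  public int getPartition(K key, V value, int numReduceTasks) {
    // mask the sign bit so a negative hashCode still yields a valid index
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}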
Re: Delivery Status Notification (Failure)
Hi Simon, MapReduce is a framework developed by Google that uses a programming model based on two functions called Map and Reduce; both the framework and the programming model are called MapReduce. Hadoop is an open-source implementation of MapReduce. HTH, -- Edson Ramiro Lucas Filho http://www.inf.ufpr.br/erlf07/ On 10 June 2010 17:40, Simon Narowki simon.naro...@gmail.com wrote: Thanks, Abhishek, for your answer. But sorry, I still don't understand... What do you mean by the runtime/programming support needed for MapReduce? Could you please mention some other implementations of MapReduce? Cheers, Simon
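To make the two functions concrete, here is the canonical word count sketched against the org.apache.hadoop.mapreduce API (class names are illustrative): map emits a (word, 1) pair per word, and the framework gathers all the pairs for one word and hands them to a single reduce call to sum.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map: for every line of input, emit a (word, 1) pair per word.
  public static class WCMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (token.isEmpty()) continue;
        word.set(token);
        ctx.write(word, ONE);
      }
    }
  }

  // Reduce: the framework groups every 1 emitted for a word into one call.
  public static class WCReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      ctx.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "wordcount");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(WCMapper.class);
    job.setReducerClass(WCReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}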
copyToLocal
Hi, so OK, I am using copyToLocal through an automation script we have and seeing odd results. I am not sure if this is something I am doing wrong, a defect, or there is a known good reason for it. Let me know; I would like to correct this either in my own script (happy to have a go at fixing a bug in the fs code) or in my own understanding, if there is some good reason for this. Scenario:

hadoop fs -copyToLocal event/2010_06_10/81ae7c24745211df9f6d002590008422 /data/2010_06_10/81ae7c24745211df9f6d002590008422

results in my part files showing up in /data/2010_06_10/81ae7c24745211df9f6d002590008422/81ae7c24745211df9f6d002590008422

If I try [Hadoop 0.20.1]

hadoop fs -copyToLocal event/2010_06_10/81ae7c24745211df9f6d002590008422 /data/2010_06_10

the part files show up in just /data/2010_06_10 (with no creation of the UUID directory like before). My desired result is to take the files from event/2010_06_10/81ae7c24745211df9f6d002590008422 and end up with them in /data/2010_06_10/81ae7c24745211df9f6d002590008422, with the trailing directory creating itself as it does in the first scenario, but without duplicating it as it does... weirdly. /* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */
Re: Delivery Status Notification (Failure)
Hadoop is an open source implementation of the runtime/programming support needed for MapReduce. Several different implementations of MapReduce are possible. Google has its own that is different from Hadoop. Abhishek
present at seajug?
hey guys, anyone from your group interested in presenting on hadoop or something related at seajug by any chance? cheers, -- Nimret Sandhu http://www.nimret.com http://www.nimsoft.biz On Wednesday, June 09, 2010 06:41:07 pm Sean Jensen-Grey wrote: Hello Fellow Hadoopists, We are meeting at 7:15 pm on June 17th at the University Heights Community Center, 5031 University Way NE, Seattle WA 98105, Room #110. We are looking for people to present, so if you would like to get the word out, please contact either myself or Chris Wilkes. The meetings are informal and highly conversational. If you have questions about Hadoop and map reduce, this is a great place to ask them. Sean Jensen-Grey se...@seattlehadoop.org Chris Wilkes cwil...@seattlehadoop.org Seattle Hadoop Distributed Computing User Meeting == Bringing Hadoopists Together On the 3rd Thursday of the Month. We focus predominantly on distributed data processing using a map reduce style. The meetings are open to all and free of charge. When: Thursday June 17th, 7:15 prompt start - 8:45. Where: University Heights Community Center, Room 110. Outline for June 17th: Hands-on Pig UDF: Chris Wilkes will walk through code samples on how to create and use your own User Defined Functions with Pig. Compute-bound map reduce: Sean Jensen-Grey will show some strategies for running compute-bound tasks on Hadoop. Please sign up to the list annou...@seattlehadoop.org for late-breaking meeting information and post-meeting communication. Subscribe via email seattlehadoop-announce+subscr...@googlegroups.com or http://groups.google.com/group/seattlehadoop-announce Regards, Sean & Chris http://seattlehadoop.org/
Re: Delivery Status Notification (Failure)
You can find more about MapReduce here: http://labs.google.com/papers/mapreduce.html Some of the implementations (like Hadoop) are listed on this page: http://en.wikipedia.org/wiki/MapReduce Zeev
Re: Delivery Status Notification (Failure)
Hi Edson, thank you for the answer. That's right, MapReduce is the Google framework based on the two functions Map and Reduce. If I understood it correctly, Hadoop is an implementation of the Map and Reduce functions of MapReduce. My question is: does Hadoop include Google's MapReduce framework as well? Regards, Simon
Delivery Status Notification (Failure)
Dear all, I am a new Hadoop user and am a little confused about the difference between Hadoop and MapReduce. Could anyone please clarify it for me? Thanks! Simon
Re: copyToLocal
I think 3 weeks of no sleep caused this. My automation script failed, leaving the directory there, so when it re-ran, THEN it caused this weirdness. I guess if copyToLocal sees the destination directory already existing, it appends the source directory as a child of the local one (so in my first scenario, /data/2010_06_10/81ae7c24745211df9f6d002590008422 already existed because my script created it the first time). Still odd behavior; whatever, it is fine, sorry to bother... I just added a step to my automation script to remove the directory before I do a copyToLocal. -- /* Joe Stein http://www.linkedin.com/in/charmalloc */
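For anyone doing the same thing from Java instead of a shell script, a sketch of that guard (the paths are taken from the messages above, everything else is illustrative): delete the stale local directory first, so copyToLocal recreates it rather than nesting the source directory inside it.

import java.io.File;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class CopyToLocalGuard {
  public static void main(String[] args) throws IOException {
    Path src = new Path("event/2010_06_10/81ae7c24745211df9f6d002590008422");
    File dst = new File("/data/2010_06_10/81ae7c24745211df9f6d002590008422");
    // a failed earlier run can leave the target behind, which makes
    // copyToLocal nest the source directory inside it; remove it first
    if (dst.exists()) {
      FileUtil.fullyDelete(dst);
    }
    FileSystem fs = FileSystem.get(new Configuration());
    fs.copyToLocalFile(src, new Path(dst.getAbsolutePath()));
  }
}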