Cool - thanks for the confirmation and link, Joey, very helpful.
-Original Message-
From: Joey Echeverria [mailto:j...@cloudera.com]
Sent: 14 March 2012 19:03
To: common-user@hadoop.apache.org
Subject: Re: decompressing bzip2 data with a custom InputFormat
Yes you have to deal with
done?
Thanks!
Tony
-Original Message-
From: Tony Burton [mailto:tbur...@sportingindex.com]
Sent: 12 March 2012 18:05
To: common-user@hadoop.apache.org
Subject: decompressing bzip2 data with a custom InputFormat
Hi,
I'm setting up a map-only job that reads large bzip2-compressed data files,
parses the XML and writes out the same data in plain text format. My XML
InputFormat extends TextInputFormat and has a RecordReader based upon the one
you can see at http://xmlandhadoop.blogspot.com/ (my version of
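The RecordReader pattern at that blog post boils down to scanning the byte stream for a start tag and buffering bytes until the matching end tag. A minimal stdlib-only sketch of that scanning logic (class, method, and tag names here are illustrative, not taken from the thread or the blog post):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Sketch of the tag-scanning logic behind XML record readers: find the
// next <record>...</record> block in a raw byte stream. Names are
// illustrative; this is not the blog post's actual code.
public class XmlRecordScanner {

    // Returns the next record (start tag through end tag, inclusive),
    // or null when the stream is exhausted first.
    public static String nextRecord(InputStream in, String startTag, String endTag)
            throws IOException {
        if (!scanTo(in, startTag.getBytes(StandardCharsets.UTF_8), null)) {
            return null;
        }
        StringBuilder buf = new StringBuilder(startTag);
        if (!scanTo(in, endTag.getBytes(StandardCharsets.UTF_8), buf)) {
            return null; // stream ended before the record was closed
        }
        return buf.toString();
    }

    // Consumes bytes until the sequence 'match' has been seen; if 'out' is
    // non-null, every consumed byte is appended to it. Returns false on EOF.
    // The (char) cast assumes single-byte (ASCII-range) content, which is
    // enough for a sketch.
    private static boolean scanTo(InputStream in, byte[] match, StringBuilder out)
            throws IOException {
        int i = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (out != null) {
                out.append((char) b);
            }
            if (b == match[i]) {
                if (++i == match.length) {
                    return true;
                }
            } else {
                i = (b == match[0]) ? 1 : 0;
            }
        }
        return false;
    }
}
```

A real RecordReader would additionally stop scanning at the split boundary, which this sketch ignores.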
stackoverflow.com/questions/7692994/custom-inputformat-with-hive.
If there's a resource someone can point me to that'd also be great.
Many thanks in advance,
Mike
Hi,
I am quite new to Hadoop. I wrote my own StreamFastaInputFormat and
StreamFastaRecordReader in
$hadoopbase/src/contrib/streaming/src/java/org/apache/hadoop/streaming/.
I ran "ant" under the directory $hadoopbase/src/contrib/streaming/
using the default build.xml. However, it failed due to the
The last I heard, there were some discussions of creating the Solr index
using Hadoop MapReduce instead, rather than pushing the Solr index into HDFS and so on.
SOLR-1045 and SOLR-1301 can provide more info.
Cheers,
/R
On 2/24/10 4:23 PM, "Rakhi Khatwani" wrote:
Hi,
Has anyone tried creating a custom InputFormat which reads from a
Solr index for processing with MapReduce? Is it possible to do that, and
how?
Regards,
Raakhi
Hi All!
I am implementing a custom InputFormat.
Its custom RecordReader uses LineRecordReader.LineReader inside.
In some cases its read() method returns 0, i.e. it reads 0 bytes. This
happens also in a unit test where it reads from a regular file on a UNIX
filesystem.
What does it mean and how should I
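The answer to this question is cut off in the archive. As a general point (my reading, not a quote from the thread): a return value of 0 means the call consumed no bytes, so a read loop must treat it as "no progress" rather than as data, or it can spin forever. A stdlib-only sketch of a defensive read loop, with names of my own choosing:

```java
import java.io.IOException;
import java.io.InputStream;

// Defensive read loop. InputStream.read(byte[], off, len) may return 0
// (notably when len == 0), and code layered on top of stream reads must
// treat "0 bytes read" as no progress rather than as data.
public class SafeReader {

    // Reads up to 'len' bytes into buf, looping until the buffer is full,
    // EOF is hit, or a call makes no progress. Returns bytes actually read.
    public static int readFully(InputStream in, byte[] buf, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, total, len - total);
            if (n <= 0) {
                break; // -1 = EOF, 0 = no progress; stop instead of spinning
            }
            total += n;
        }
        return total;
    }
}
```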
> -Original Message-
> From: valentina kroshilina [mailto:kroshil...@gmail.com]
> Sent: 8 January 2010 12:05
> To: common-user@hadoop.apache.org
> Subject: custom InputFormat
I have a LongWritable, IncidentWritable key-value pair as output from one
job that I want to read as input in my second job, where IncidentWritable
is a custom Writable (see code below).
How do I read IncidentWritable in my custom Reader? I don't know how to
convert byte[] to IncidentWritable.
Code
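The code following this line was cut from the archive, so the real IncidentWritable's fields are unknown; the field names in this sketch are hypothetical. The usual Writable pattern answers the byte[] question: wrap the bytes in a DataInputStream, construct an empty instance, and call readFields(). A stdlib-only sketch:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of the Writable serialization pattern using only java.io.
// Field names here are hypothetical; the real IncidentWritable's code
// was cut from the archive.
public class IncidentWritableSketch {
    private long timestamp;
    private String description;

    public IncidentWritableSketch() { } // Writables need a no-arg constructor

    public IncidentWritableSketch(long timestamp, String description) {
        this.timestamp = timestamp;
        this.description = description;
    }

    public void write(DataOutput out) throws IOException {
        out.writeLong(timestamp);
        out.writeUTF(description);
    }

    // Deserialization: read fields back in exactly the order they were written.
    public void readFields(DataInput in) throws IOException {
        timestamp = in.readLong();
        description = in.readUTF();
    }

    // Answers the question in the thread: turn a byte[] back into the object
    // by wrapping it in a DataInputStream and calling readFields().
    public static IncidentWritableSketch fromBytes(byte[] bytes) throws IOException {
        IncidentWritableSketch w = new IncidentWritableSketch();
        w.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return w;
    }

    public byte[] toBytes() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        write(new DataOutputStream(bos));
        return bos.toByteArray();
    }

    public long getTimestamp() { return timestamp; }
    public String getDescription() { return description; }
}
```

In a real job, a SequenceFile output/input format would handle this round-trip for you; the sketch just shows the mechanism.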
FYI, basing off of Antonio's great work, I finally got around to making this
InputFormat tonight: see http://github.com/kevinweil/IntegerListInputFormat.
If people are interested, I'm happy to format it and license it
appropriately and commit it to core hadoop. Let me know, otherwise I'll
just le
Antonio,
If you're interested in open sourcing this, I'd be interested in
using/helping. We do something internally that's similar to this, and I've
been meaning to write a general-purpose version for a while. Would love to
see what you've done and contribute back any changes we make. Github?
Philip,
that was quick and precise. I learned something today. Thank you!
Antonio
On Fri, Dec 11, 2009 at 8:20 PM, Philip Zeyliger wrote:
Hi Antonio,
Check out MapTask.java. When your job gets instantiated on the cluster, an
InputSplit object is created for the task, using reflection. An InputSplit
is a Writable, and, like all writables, it gets created with an empty
constructor and initialized with readFields().
If you implement
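Philip's mechanism can be modeled without Hadoop on the classpath: the framework instantiates the split class reflectively through its no-arg constructor, then hands it the serialized bytes via readFields(). A stdlib-only sketch (the Rehydratable interface stands in for Hadoop's Writable; all names are mine):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

// Minimal model of how MapTask rehydrates an InputSplit: instantiate the
// class reflectively via its no-arg constructor, then call readFields().
// 'Rehydratable' stands in for Hadoop's Writable interface.
public class SplitFactory {

    public interface Rehydratable {
        void readFields(DataInput in) throws IOException;
    }

    // Mimics what the framework does with the split class name it was given.
    public static Rehydratable rehydrate(String className, byte[] serialized)
            throws Exception {
        Rehydratable obj = (Rehydratable) Class.forName(className)
                .getDeclaredConstructor()   // requires a no-arg constructor
                .newInstance();
        obj.readFields(new DataInputStream(new ByteArrayInputStream(serialized)));
        return obj;
    }

    // Example split type with one serialized field.
    public static class RangeSplit implements Rehydratable {
        public long start;

        public RangeSplit() { } // empty constructor, as Philip describes

        @Override
        public void readFields(DataInput in) throws IOException {
            start = in.readLong();
        }
    }
}
```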
Hi,
I've been trying to code a pretty simple InputFormat. The idea is this: I
have an array of numbers (say, the range [0-5000]) and I want each mapper to
receive a split of size 500, i.e. 500 LongWritables.
this is an excerpt from the class extending InputSplit:
public class myInputSplit extend
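The excerpt stops before the split code, but the partitioning the poster describes — carving [0-5000] into splits of 500 — is plain fixed-size chunking. A stdlib-only sketch of the split arithmetic (class and method names are mine, not from the thread):

```java
import java.util.ArrayList;
import java.util.List;

// Fixed-size partitioning of a numeric range, as described in the thread:
// the range [lo, hi) is carved into chunks of 'size' values, one per mapper.
public class RangeSplitter {

    // Simple value holder for one split: [start, end) within the range.
    public static final class Split {
        public final long start;
        public final long end;
        Split(long start, long end) { this.start = start; this.end = end; }
    }

    public static List<Split> computeSplits(long lo, long hi, long size) {
        List<Split> splits = new ArrayList<>();
        for (long s = lo; s < hi; s += size) {
            // Clamp the last split so a range that isn't a multiple of
            // 'size' still ends exactly at 'hi'.
            splits.add(new Split(s, Math.min(s + size, hi)));
        }
        return splits;
    }
}
```

Each Split would then back one InputSplit, with getSplits() returning the list and the RecordReader emitting one LongWritable per value in its [start, end) range.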