Hi Brock,
I'm not getting any errors.
I'm issuing the following command now:
hadoop jar hadoop-0.21.0/mapred/contrib/streaming/hadoop-0.21.0-streaming.jar \
  -input SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000 \
  -output SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned \
  -mapper '/root/bowtiestreaming.sh' -jobconf mapred.reduce.tasks=0 \
  -file bowtiestreaming.sh
The only error I get using "cat hadoop-0.21.0/logs/* |grep Exception" is:
org.apache.hadoop.fs.ChecksumException: Checksum error:
file:/root/hadoop-0.21.0/logs/history/job_201112060917_0002_root at 2620416
2011-12-06 11:14:34,515 WARN
org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell
command org.apache.hadoop.util.Shell$ExitCodeException: kill -13816: No
such process
2011-12-06 11:14:43,039 WARN
org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell
command org.apache.hadoop.util.Shell$ExitCodeException: kill -13862: No
such process
2011-12-06 11:14:46,282 WARN
org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell
command org.apache.hadoop.util.Shell$ExitCodeException: kill -13891: No
such process
2011-12-06 11:14:49,841 WARN
org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell
command org.apache.hadoop.util.Shell$ExitCodeException: kill -13978: No
such process
Best regards,
Romeo
On 12/06/2011 10:49 AM, Brock Noland wrote:
Does your job end with an error?
I am guessing what you want is:
-mapper bowtiestreaming.sh -file '/root/bowtiestreaming.sh'
First option says use your script as a mapper and second says ship
your script as part of the job.
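
Applied to the job in this thread, that suggestion might look like the sketch below. It is only an illustration: print_streaming_cmd is a hypothetical helper that prints the command instead of running it, and the jar path is the one quoted elsewhere in this thread. The idea is that -file ships the script from its local path, while -mapper refers to it by its base name.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): prints, but does not run, a
# streaming command that ships the script with -file and names it by its
# base name in -mapper. The jar path is the one quoted in this thread.
STREAMING_JAR="hadoop-0.21.0/mapred/contrib/streaming/hadoop-0.21.0-streaming.jar"

print_streaming_cmd() {
    script_path="$1"   # local path of the mapper script
    input="$2"         # input path on HDFS
    output="$3"        # output directory on HDFS
    script_name=$(basename "$script_path")
    printf '%s\n' "hadoop jar $STREAMING_JAR -input $input -output $output -mapper $script_name -file $script_path -jobconf mapred.reduce.tasks=0"
}
```

Calling print_streaming_cmd /root/bowtiestreaming.sh INPUT OUTPUT prints a command line that can be reviewed before it is pasted into a shell.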
Brock
On Tue, Dec 6, 2011 at 4:59 PM, Romeo Kienzler <ro...@ormium.de> wrote:
Hi,
I've got the following setup for NGS read alignment:
A script accepting data from stdin/out:
------------------------------------------------------------
cat /root/bowtiestreaming.sh
cd /home/streamsadmin/crossbow-1.1.2/bin/linux32/
/home/streamsadmin/crossbow-1.1.2/bin/linux32/bowtie -m 1 -q e_coli --12 - 2> /root/bowtie.log
A file copied to HDFS:
------------------------------------------------------------
hadoop fs -put \
  SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000 \
  SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000
A streaming job invoked with only the mapper:
------------------------------------------------------------
hadoop jar hadoop-0.21.0/mapred/contrib/streaming/hadoop-0.21.0-streaming.jar \
  -input SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000 \
  -output SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned \
  -mapper '/root/bowtiestreaming.sh' -jobconf mapred.reduce.tasks=0
The file cannot be found even though it is displayed:
------------------------------------------------------------
hadoop fs -cat /user/root/SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned
11/12/06 09:07:47 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
11/12/06 09:07:48 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
cat: File does not exist: /user/root/SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned
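
One thing that may be worth checking here: a MapReduce job writes its output as a directory containing part files, not as a single file, so listing the output path and reading the part files inside it can show whether anything was written at all. The sketch below assumes that layout. show_job_output and HADOOP_BIN are hypothetical names introduced only for this illustration, and the part-* glob is an assumption about the part file names.

```shell
#!/bin/sh
# Hypothetical helper: list a job's output directory, then print the first
# lines of its part files. HADOOP_BIN is an assumed override hook so the
# function can be tried out without a cluster (for example HADOOP_BIN=echo).
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

show_job_output() {
    out_dir="$1"   # the path passed to -output when the job was submitted
    # If the listing fails, the job most likely never created its output.
    "$HADOOP_BIN" fs -ls "$out_dir" || return 1
    # The glob is quoted so that hadoop fs expands it, not the local shell.
    "$HADOOP_BIN" fs -cat "$out_dir/part-*" | head
}
```

For the job above this would be called as show_job_output followed by the .aligned output path.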
The file looks like this (tab separated):
head SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000
@SRR014475.1 :1:1:108:111 length=36 GAGTTTTACGTCGTCCTAAAACAGTACATAAAAATA
I3IIIII+I(%BH43%III7I(5IIIIIII*<&II+
@SRR014475.2 :1:1:112:26 length=36 GNNNNNNTTCCCTTTTCAACTTCCAAATCACCTAAC
I!!!!!!II=I<IIII@II5II)/$;%+*/&%%#&#
@SRR014475.3 :1:1:101:937 length=36 GAAGATCCGGTACAACAAAACCTGATGTAAATGGTA
IIIIIIIIIIIIIIIIIAIIIIIIAII%I<IIII0G
@SRR014475.4 :1:1:124:64 length=36 GAACACATAGAACAACAGGATTCGCCAGAACACCTG
IIIIIIIIIIIIIII><CI+@5+)'(-'&;&%$;+;
@SRR014475.5 :1:1:108:897 length=36 GGAAGAGATGAAGTGGGTCGTTGTGGTGTGTTTGTT
I0I:I'IIII+IG3II46II0>C@=III()+:+2&$
@SRR014475.6 :1:1:106:14 length=36 GNNNNNNNNNNNNNNNTNTAGCATTAAGTAATTGGT
I!!!!!!!!!!!!!!!I!I6I*+III:%IB0+I.%?
@SRR014475.7 :1:1:118:934 length=36 GGTTACTACTCTGCGACTCCTCGCAGAAGAGACGCT
III0%%)&%I.I&I;III.(I@E&2>*'+1;;#;&'
@SRR014475.8 :1:1:123:8 length=36 GNNNNNNNNNNNNNNNNNNNNNNNNNNNTNNNNTNN
I!!!!!!!!!!!!!!!!!!!!!!!!!!!$!!!!(!!
@SRR014475.9 :1:1:118:88 length=36 GGAAACAAAATGGCGCGCTACCAGGTAACGCGCCAC
IIIIIIIIIIIIIIIGIAA4;1+16*;*+)'$%#$%
@SRR014475.10 :1:1:92:122 length=36 ATTTGCTGCCAATGGCGAGATTAAAAACGAATAATA
IIIIIIIIIIIIIICII;CGIDI?%$I:%6)C*;#;
and the result like this:
cat SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000 | ./bowtiestreaming.sh | head
@SRR014475.3 :1:1:101:937 length=36 +
gi|110640213|ref|NC_008253.1| 3393863 GAAGATCCGGTACAACAAAACCTGATGTAAATGGTA
IIIIIIIIIIIIIIIIIAIIIIIIAII%I<IIII0G 0 7:T>C,27:G>T
@SRR014475.4 :1:1:124:64 length=36 +
gi|110640213|ref|NC_008253.1| 2288633 GAACACATAGAACAACAGGATTCGCCAGAACACCTG
IIIIIIIIIIIIIII><CI+@5+)'(-'&;&%$;+; 0 30:T>C
@SRR014475.5 :1:1:108:897 length=36 +
gi|110640213|ref|NC_008253.1| 4389356 GGAAGAGATGAAGTGGGTCGTTGTGGTGTGTTTGTT
I0I:I'IIII+IG3II46II0>C@=III()+:+2&$ 0
5:C>A,28:G>T,29:C>G,30:A>T,34:C>T
@SRR014475.9 :1:1:118:88 length=36 -
gi|110640213|ref|NC_008253.1| 3598410 GTGGCGCGTTACCTGGTAGCGCGCCATTTTGTTTCC
%$#%$')+*;*61+1;4AAIGIIIIIIIIIIIIIII 0
@SRR014475.15 :1:1:87:967 length=36 +
gi|110640213|ref|NC_008253.1| 4474247 GACTACACGATCGCCTGCCTTAATATTCTTTACACC
IIIIIIIIIIIIA27II7CIII*I5I+FIIII?II' 0 6:G>A,26:G>T
@SRR014475.20 :1:1:108:121 length=36 -
gi|110640213|ref|NC_008253.1| 37761 AAAAAATGCATATTGTTTTAGAGTGTGATTATTAGC
I<D4II'2I<IIC/;B?FIIIIIIIIIIIIIIIIII 0 12:C>T
@SRR014475.23 :1:1:75:54 length=36 +
gi|110640213|ref|NC_008253.1| 2465453 GGTTTCTTTCTGCGCAGATGCCAGACGGTCTTTATA
IIIIIIIIIIIICII<III;';29=9I.4%EE2)*' 0
@SRR014475.24 :1:1:89:904 length=36 -
gi|110640213|ref|NC_008253.1| 3216193 ATTAGTGTTAAGATTTCTATATTGTTGTTTTAGGCC
#%);%;$EI-;$%8%&I%I/+IIIIIIIIIIIIIII 0
18:C>T,21:G>T,30:C>T,31:T>G,34:A>T
@SRR014475.27 :1:1:74:887 length=36 -
gi|110640213|ref|NC_008253.1| 540567 AAACGTGGCGTTTCAGGGATCGTTTGCCTGCATTAC
*&(%9%0F3.@4;&?4I3I6%:9AI0HIIIIIIIII 0 34:C>A,35:C>A
@SRR014475.30 :1:1:123:73 length=36 +
gi|110640213|ref|NC_008253.1| 3391697 AAAAGATTGCGACTGACGGCGCAAATGCCCTCCGTT
IIIIIIIIICI:II3*<4.*'+%'&)&$;+;%;%;; 0 30:C>T,34:G>T
Any ideas?
Best regards,
Romeo
-------------
Romeo Kienzler
r o m e o @ o r m i u m . d e