Hi Brock,

I'm not getting any errors.

I'm issuing the following command now:

hadoop jar hadoop-0.21.0/mapred/contrib/streaming/hadoop-0.21.0-streaming.jar -input SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000 -output SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned -mapper '/root/bowtiestreaming.sh' -jobconf mapred.reduce.tasks=0 -file bowtiestreaming.sh

The only error I get using "cat hadoop-0.21.0/logs/* |grep Exception" is:
org.apache.hadoop.fs.ChecksumException: Checksum error: file:/root/hadoop-0.21.0/logs/history/job_201112060917_0002_root at 2620416
2011-12-06 11:14:34,515 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -13816: No such process
2011-12-06 11:14:43,039 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -13862: No such process
2011-12-06 11:14:46,282 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -13891: No such process
2011-12-06 11:14:49,841 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -13978: No such process


Best regards,

Romeo

On 12/06/2011 10:49 AM, Brock Noland wrote:
Does your job end with an error?

I am guessing what you want is:

-mapper bowtiestreaming.sh -file '/root/bowtiestreaming.sh'

The first option tells streaming to use your script as the mapper, and
the second ships your script as part of the job.
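
Applied to the full command from the original message, that suggestion might look like the sketch below. The jar path, input name, and -jobconf setting are copied from the thread; the helper name print_streaming_cmd is hypothetical, and it only prints the command so it can be inspected before anything is submitted.

```shell
#!/bin/sh
# Sketch only: prints the streaming command with the suggested
# -mapper/-file arguments; it does not contact the cluster.
# print_streaming_cmd is a hypothetical helper name.
INPUT=SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000
JAR=hadoop-0.21.0/mapred/contrib/streaming/hadoop-0.21.0-streaming.jar

print_streaming_cmd() {
    # -file ships the local script with the job;
    # -mapper then refers to it by its bare file name.
    printf '%s ' \
        hadoop jar "$JAR" \
        -input "$INPUT" \
        -output "$INPUT.aligned" \
        -mapper bowtiestreaming.sh \
        -jobconf mapred.reduce.tasks=0 \
        -file /root/bowtiestreaming.sh
    printf '\n'
}
```

Calling print_streaming_cmd shows the assembled line; pasting that line into the shell would submit the job.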

Brock

On Tue, Dec 6, 2011 at 4:59 PM, Romeo Kienzler <ro...@ormium.de> wrote:
Hi,

I've got the following setup for NGS read alignment:


A script that reads its data from stdin and writes to stdout:
------------------------------------------------------------
cat /root/bowtiestreaming.sh
cd /home/streamsadmin/crossbow-1.1.2/bin/linux32/
/home/streamsadmin/crossbow-1.1.2/bin/linux32/bowtie -m 1 -q e_coli --12 - 2> /root/bowtie.log



A file copied to HDFS:
------------------------------------------------------------
hadoop fs -put
SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000
SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000

A streaming job invoked with only the mapper:
------------------------------------------------------------
hadoop jar
hadoop-0.21.0/mapred/contrib/streaming/hadoop-0.21.0-streaming.jar -input
SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000
-output
SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned
-mapper '/root/bowtiestreaming.sh' -jobconf mapred.reduce.tasks=0

The file cannot be found even though it is displayed:
------------------------------------------------------------
hadoop fs -cat /user/root/SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned
11/12/06 09:07:47 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
11/12/06 09:07:48 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
cat: File does not exist: /user/root/SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned
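
One thing worth ruling out here: the -output path of a MapReduce job is normally a directory holding one part-* file per task, so after a successful run the results would be read from the files inside it rather than from the path itself. A minimal sketch for checking this, assuming the output path from the thread; print_output_checks is a hypothetical helper name, and the commands are printed rather than executed so nothing touches the cluster:

```shell
#!/bin/sh
# Sketch only: prints the commands for inspecting a job's output directory.
# print_output_checks is a hypothetical helper name.
OUTPUT=/user/root/SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000.aligned

print_output_checks() {
    # List the directory first; if it is absent, the job most likely
    # failed before writing any output.
    printf 'hadoop fs -ls %s\n' "$OUTPUT"
    # Task output lives in part-* files inside the directory; the single
    # quotes keep the local shell from expanding the glob.
    printf "hadoop fs -cat '%s/part-*'\n" "$OUTPUT"
}
```

If the listing shows no such directory at all, the mapper tasks probably failed and the task logs are the next place to look.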


The file looks like this (tab-separated):
head
SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000
@SRR014475.1 :1:1:108:111 length=36     GAGTTTTACGTCGTCCTAAAACAGTACATAAAAATA
    I3IIIII+I(%BH43%III7I(5IIIIIII*<&II+
@SRR014475.2 :1:1:112:26 length=36      GNNNNNNTTCCCTTTTCAACTTCCAAATCACCTAAC
    I!!!!!!II=I<IIII@II5II)/$;%+*/&%%#&#
@SRR014475.3 :1:1:101:937 length=36     GAAGATCCGGTACAACAAAACCTGATGTAAATGGTA
    IIIIIIIIIIIIIIIIIAIIIIIIAII%I<IIII0G
@SRR014475.4 :1:1:124:64 length=36      GAACACATAGAACAACAGGATTCGCCAGAACACCTG
    IIIIIIIIIIIIIII><CI+@5+)'(-'&;&%$;+;
@SRR014475.5 :1:1:108:897 length=36     GGAAGAGATGAAGTGGGTCGTTGTGGTGTGTTTGTT
    I0I:I'IIII+IG3II46II0>C@=III()+:+2&$
@SRR014475.6 :1:1:106:14 length=36      GNNNNNNNNNNNNNNNTNTAGCATTAAGTAATTGGT
    I!!!!!!!!!!!!!!!I!I6I*+III:%IB0+I.%?
@SRR014475.7 :1:1:118:934 length=36     GGTTACTACTCTGCGACTCCTCGCAGAAGAGACGCT
    III0%%)&%I.I&I;III.(I@E&2>*'+1;;#;&'
@SRR014475.8 :1:1:123:8 length=36       GNNNNNNNNNNNNNNNNNNNNNNNNNNNTNNNNTNN
    I!!!!!!!!!!!!!!!!!!!!!!!!!!!$!!!!(!!
@SRR014475.9 :1:1:118:88 length=36      GGAAACAAAATGGCGCGCTACCAGGTAACGCGCCAC
    IIIIIIIIIIIIIIIGIAA4;1+16*;*+)'$%#$%
@SRR014475.10 :1:1:92:122 length=36     ATTTGCTGCCAATGGCGAGATTAAAAACGAATAATA
    IIIIIIIIIIIIIICII;CGIDI?%$I:%6)C*;#;


and the result looks like this:

cat
SRR014475.lite.nodoublequotewithendsnocommas.fastq.received.1-read-per-line-format.10000
|./bowtiestreaming.sh |head
@SRR014475.3 :1:1:101:937 length=36     +
gi|110640213|ref|NC_008253.1|   3393863 GAAGATCCGGTACAACAAAACCTGATGTAAATGGTA
    IIIIIIIIIIIIIIIIIAIIIIIIAII%I<IIII0G  0       7:T>C,27:G>T
@SRR014475.4 :1:1:124:64 length=36      +
gi|110640213|ref|NC_008253.1|   2288633 GAACACATAGAACAACAGGATTCGCCAGAACACCTG
    IIIIIIIIIIIIIII><CI+@5+)'(-'&;&%$;+;  0       30:T>C
@SRR014475.5 :1:1:108:897 length=36     +
gi|110640213|ref|NC_008253.1|   4389356 GGAAGAGATGAAGTGGGTCGTTGTGGTGTGTTTGTT
    I0I:I'IIII+IG3II46II0>C@=III()+:+2&$  0
5:C>A,28:G>T,29:C>G,30:A>T,34:C>T
@SRR014475.9 :1:1:118:88 length=36      -
gi|110640213|ref|NC_008253.1|   3598410 GTGGCGCGTTACCTGGTAGCGCGCCATTTTGTTTCC
    %$#%$')+*;*61+1;4AAIGIIIIIIIIIIIIIII  0
@SRR014475.15 :1:1:87:967 length=36     +
gi|110640213|ref|NC_008253.1|   4474247 GACTACACGATCGCCTGCCTTAATATTCTTTACACC
    IIIIIIIIIIIIA27II7CIII*I5I+FIIII?II'  0       6:G>A,26:G>T
@SRR014475.20 :1:1:108:121 length=36    -
gi|110640213|ref|NC_008253.1|   37761   AAAAAATGCATATTGTTTTAGAGTGTGATTATTAGC
    I<D4II'2I<IIC/;B?FIIIIIIIIIIIIIIIIII  0       12:C>T
@SRR014475.23 :1:1:75:54 length=36      +
gi|110640213|ref|NC_008253.1|   2465453 GGTTTCTTTCTGCGCAGATGCCAGACGGTCTTTATA
    IIIIIIIIIIIICII<III;';29=9I.4%EE2)*'  0
@SRR014475.24 :1:1:89:904 length=36     -
gi|110640213|ref|NC_008253.1|   3216193 ATTAGTGTTAAGATTTCTATATTGTTGTTTTAGGCC
    #%);%;$EI-;$%8%&I%I/+IIIIIIIIIIIIIII  0
18:C>T,21:G>T,30:C>T,31:T>G,34:A>T
@SRR014475.27 :1:1:74:887 length=36     -
gi|110640213|ref|NC_008253.1|   540567  AAACGTGGCGTTTCAGGGATCGTTTGCCTGCATTAC
    *&(%9%0F3.@4;&?4I3I6%:9AI0HIIIIIIIII  0       34:C>A,35:C>A
@SRR014475.30 :1:1:123:73 length=36     +
gi|110640213|ref|NC_008253.1|   3391697 AAAAGATTGCGACTGACGGCGCAAATGCCCTCCGTT
    IIIIIIIIICI:II3*<4.*'+%'&)&$;+;%;%;;  0       30:C>T,34:G>T
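
The pipeline above shows the mapper's output but hides its exit status, which matters because Hadoop streaming by default treats a non-zero exit from the mapper as a task failure (the stream.non.zero.exit.is.failure setting). A small sketch for checking both locally; check_mapper is a hypothetical helper name, the 100-line sample size is arbitrary, and the mapper command and input file are passed in as arguments:

```shell
#!/bin/sh
# Sketch only: feed the first lines of a file to a mapper command and
# report its exit status and how many lines it produced.
# check_mapper is a hypothetical helper name.
check_mapper() {
    mapper=$1   # e.g. ./bowtiestreaming.sh
    input=$2    # local copy of the input file
    out=$(mktemp) || return 1
    # The mapper is the last command in the pipeline, so $? below is
    # its exit status (piping into head afterwards would hide it).
    head -n 100 "$input" | "$mapper" > "$out"
    status=$?
    lines=$(grep -c '' "$out")
    rm -f "$out"
    echo "exit status: $status, output lines: $lines"
    return "$status"
}
```

For the setup in this thread the call would be something like: check_mapper ./bowtiestreaming.sh followed by the name of the local input file.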


Any ideas?

Best regards,

Romeo


-------------
Romeo Kienzler
r o m e o @ o r m i u m . d e

