Re: Request for Help

2014-08-26 Thread Akhil Das
Hi

Not sure this is the right way of doing it, but if you can create a
PairRDDFunctions from that RDD, then you can use the following piece of code
to access the file names from the RDD.


PairRDDFunctions<K, V> ds = ...; // initializer elided in the original message

// Print the name and path of the file behind each partition
for (int i = 0; i < ds.values().getPartitions().length; i++) {
    UnionPartition upp = (UnionPartition) ds.values().getPartitions()[i];
    NewHadoopPartition npp = (NewHadoopPartition) upp.split();
    System.out.println("File " + npp.serializableHadoopSplit().value().toString());
}
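
For comparison, here is a minimal, Spark-free sketch in plain Java of the
underlying idea: keep each file's name next to its content, so that a failure
can be traced back to the file that caused it. In Spark itself,
SparkContext.wholeTextFiles returns (path, content) pairs that could play the
role of the map below. The class FaultyFileFinder, the helper sumLines, and
the sample file names are hypothetical and only stand in for real per-file
processing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Spark-free illustration; all names are invented for this example.
class FaultyFileFinder {

    // Stand-in for real per-file work: parses every line as an integer and sums them.
    static int sumLines(String content) {
        int total = 0;
        for (String line : content.split("\n")) {
            total += Integer.parseInt(line.trim());
        }
        return total;
    }

    // Processes (fileName -> content) pairs and returns the names of the
    // files whose processing threw a NumberFormatException.
    static List<String> findFaultyFiles(Map<String, String> filesByName) {
        List<String> faulty = new ArrayList<>();
        for (Map.Entry<String, String> entry : filesByName.entrySet()) {
            try {
                sumLines(entry.getValue());
            } catch (NumberFormatException e) {
                // The key tells us exactly which file was being handled.
                faulty.add(entry.getKey());
            }
        }
        return faulty;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("part-0001.txt", "1\n2\n3");
        files.put("part-0002.txt", "4\noops\n6");
        System.out.println(findFaultyFiles(files));
    }
}
```

Running main prints [part-0002.txt], the one sample file that contains a
non-numeric line.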



Thanks
Best Regards


On Tue, Aug 26, 2014 at 1:25 AM, yh18190 yh18...@gmail.com wrote:

 Hi Guys,

 I just want to know whether there is any way to determine which file is
 being handled by Spark from a group of input files inside a
 directory. Suppose I have 1000 files given as input; I want to
 determine which file is currently being handled by the Spark program, so
 that if an error creeps in at any point we can easily identify that
 particular file as the faulty one.

 Please let me know your thoughts.



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Request-for-Help-tp12776.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




Request for help in writing to Textfile

2014-08-25 Thread yh18190
Hi Guys,

I am currently working with a large data set. I have an RDD of type
RDD[List[(tuples)]]. I need only the tuples to be written to the text file
output using the saveAsTextFile function.
Example: val mod = modify.saveAsTextFile() produces

List((20140813,4,141127,3,HYPHLJLU,HY,KNGHWEB,USD,144.00,662.40,KY1),
(20140813,4,141127,3,HYPHLJLU,HY,DBLHWEB,USD,144.00,662.40,KY1))
List((20140813,4,141127,3,HYPHLJLU,HY,KNGHWEB,USD,144.00,662.40,KY1),
(20140813,4,141127,3,HYPHLJLU,HY,DBLHWEB,USD,144.00,662.40,KY1))

I need the following output, with only the tuple values, in a text file:
20140813,4,141127,3,HYPHLJLU,HY,KNGHWEB,USD,144.00,662.40,KY1
20140813,4,141127,3,HYPHLJLU,HY,DBLHWEB,USD,144.00,662.40,KY1


Please let me know if anybody has any idea how to do this without using
the collect() function. Please help me.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Request-for-help-in-writing-to-Textfile-tp12744.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Request for Help

2014-08-25 Thread yh18190
Hi Guys,

I just want to know whether there is any way to determine which file is
being handled by Spark from a group of input files inside a
directory. Suppose I have 1000 files given as input; I want to
determine which file is currently being handled by the Spark program, so
that if an error creeps in at any point we can easily identify that
particular file as the faulty one.

Please let me know your thoughts.






RE: Request for help in writing to Textfile

2014-08-25 Thread Liu, Raymond
You can try to manipulate the string you want to output before saveAsTextFile, 
something like

modify.flatMap(x => x).map { x =>
  val s = x.toString
  // Drop the surrounding parentheses of the tuple's string form
  s.subSequence(1, s.length - 1)
}

There should be a more optimized way.
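
As a rough illustration of the same flatten-then-format idea, here is a
minimal local sketch in plain Java that does not use Spark. It assumes each
tuple is modeled as a List<String> of fields, and it builds each line with
String.join instead of trimming the parentheses from the toString output. The
class TupleLines and the method toLines are hypothetical names chosen for
this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, Spark-free illustration; rows are modeled as lists of fields.
class TupleLines {

    // Flattens groups of rows and renders each row as one comma-separated line.
    static List<String> toLines(List<List<List<String>>> groups) {
        return groups.stream()
                .flatMap(List::stream)                    // groups -> individual rows
                .map(fields -> String.join(",", fields))  // row -> "a,b,c"
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> group = Arrays.asList(
                Arrays.asList("20140813", "4", "141127", "3", "HYPHLJLU", "HY",
                        "KNGHWEB", "USD", "144.00", "662.40", "KY1"),
                Arrays.asList("20140813", "4", "141127", "3", "HYPHLJLU", "HY",
                        "DBLHWEB", "USD", "144.00", "662.40", "KY1"));
        // Prints one line per row, without the List(...) or tuple parentheses.
        toLines(Arrays.asList(group)).forEach(System.out::println);
    }
}
```

Joining the fields explicitly avoids depending on the exact string form of
the tuple, at the cost of naming the separator yourself.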

Best Regards,
Raymond Liu


-Original Message-
From: yh18190 [mailto:yh18...@gmail.com] 
Sent: Monday, August 25, 2014 9:57 PM
To: u...@spark.incubator.apache.org
Subject: Request for help in writing to Textfile

Hi Guys,

I am currently working with a large data set. I have an RDD of type
RDD[List[(tuples)]]. I need only the tuples to be written to the text file
output using the saveAsTextFile function.
Example: val mod = modify.saveAsTextFile() produces

List((20140813,4,141127,3,HYPHLJLU,HY,KNGHWEB,USD,144.00,662.40,KY1),
(20140813,4,141127,3,HYPHLJLU,HY,DBLHWEB,USD,144.00,662.40,KY1))
List((20140813,4,141127,3,HYPHLJLU,HY,KNGHWEB,USD,144.00,662.40,KY1),
(20140813,4,141127,3,HYPHLJLU,HY,DBLHWEB,USD,144.00,662.40,KY1))

I need the following output, with only the tuple values, in a text file:
20140813,4,141127,3,HYPHLJLU,HY,KNGHWEB,USD,144.00,662.40,KY1
20140813,4,141127,3,HYPHLJLU,HY,DBLHWEB,USD,144.00,662.40,KY1


Please let me know if anybody has any idea how to do this without using
the collect() function. Please help me.


