From: Davies Liu dav...@databricks.com
To: jan.zi...@centrum.cz
Date: 07.10.2014 17:38
Subject: Re: Parsing one big multiple line .xml loaded in RDD using Python
CC: u...@spark.incubator.apache.org
Hi,
I have already unsuccessfully asked a quite similar question on Stack Overflow,
specifically here:
http://stackoverflow.com/questions/26202978/spark-and-python-trying-to-parse-wikipedia-using-gensim.
I've also tried some workarounds, without success; the workaround
problem can be
Maybe sc.wholeTextFiles() is what you want: you can get the whole text
of each file as a single record and parse it yourself.
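
A minimal sketch of that suggestion, not a tested solution for the Wikipedia dump. It assumes the XML has no namespace and that each file fits in the memory of a single executor, since wholeTextFiles() returns every file as one (path, content) record. The helper names parse_titles and titles_rdd and the "title" tag are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def parse_titles(file_and_content):
    """Parse one (path, content) pair as produced by sc.wholeTextFiles()."""
    path, content = file_and_content
    root = ET.fromstring(content)
    # iter() walks the whole tree; "title" is a hypothetical tag name and
    # would need a namespace prefix if the document declares one.
    return [(path, element.text) for element in root.iter("title")]


def titles_rdd(sc, path):
    """Build an RDD of (path, title) pairs from the XML files under `path`.

    `sc` is an existing SparkContext. Each file arrives as a single record,
    so a multi-line XML document is parsed as a whole instead of line by line.
    """
    return sc.wholeTextFiles(path).flatMap(parse_titles)
```

Because parse_titles is a plain function, it can be checked locally on a small (path, content) tuple before running it on the cluster.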
On Tue, Oct 7, 2014 at 1:06 AM, jan.zi...@centrum.cz wrote:
Hi,
I have already unsuccessfully asked a quite similar question on Stack Overflow,
specifically here: