tion it should be possible, but it does not seem to work for me.
______
From: Davies Liu
To:
Date: 07.10.2014 17:38
Subject: Re: Parsing one big multiple line .xml loaded in RDD using Python
CC: "u...@spark.incubator.apache.org"
Maybe sc.wholeTextFiles() is what you want; you can get the whole text
and parse it yourself.
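
A minimal sketch of that idea, under stated assumptions: the element names
"page" and "title", the helper names parse_pages and titles_from_files, and
the sample document are all hypothetical, and the real dump may use XML
namespaces that this does not handle. sc.wholeTextFiles() returns one
(path, content) pair per file, so each file has to fit in memory on a
single worker.

```python
import xml.etree.ElementTree as ET


def parse_pages(path_and_text):
    """Parse one (path, content) pair, the record shape of sc.wholeTextFiles()."""
    _path, text = path_and_text
    root = ET.fromstring(text)
    # Assumption: every record is a <page> element with a <title> child.
    return [page.findtext("title") for page in root.iter("page")]


def titles_from_files(sc, path):
    """sc is an existing SparkContext; path is a hypothetical input location."""
    # Each file arrives as a single record, so a multi-line XML document
    # is not split line by line the way sc.textFile() would split it.
    return sc.wholeTextFiles(path).flatMap(parse_pages).collect()


# Local check of the parser alone, without Spark.
sample = ("dump.xml",
          "<wiki><page><title>A</title></page>"
          "<page><title>B</title></page></wiki>")
print(parse_pages(sample))  # ['A', 'B']
```

With a running SparkContext, titles_from_files(sc, "some/input/dir") would
apply the same parser to every file under that directory.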
On Tue, Oct 7, 2014 at 1:06 AM, wrote:
> Hi,
>
> I have already unsuccessfully asked quite a similar question on Stack Overflow,
> particularly here:
> http://stackoverflow.com/questions/26202978/spark-an
Hi,
I have already unsuccessfully asked quite a similar question on Stack Overflow,
particularly here:
http://stackoverflow.com/questions/26202978/spark-and-python-trying-to-parse-wikipedia-using-gensim.
I've also tried a workaround, but without success; the workaround
problem can be f