Hi all,
Recently we encountered a problem while using Spark SQL to write an ORC
table, which is related to https://issues.apache.org/jira/browse/HIVE-10790.
To fix this problem, we decided to apply the patch from that issue to the
Hive branch that Spark 1.5 relies on.
We pull the hive branch(
Considering that PySpark is a very tightly integrated library rather than
an RPC integration, I doubt a Go integration will come any time soon.
On Fri, May 13, 2016 at 10:22 PM Sourav Chakraborty
wrote:
> Folks,
> Was curious to find out if anybody ever
I built this recently using the accepted answer on this SO page:
http://stackoverflow.com/questions/26741714/how-does-the-pyspark-mappartitions-function-work/26745371
-sujit
On Sat, May 14, 2016 at 7:00 AM, Mathieu Longtin
wrote:
> From memory:
> def
From memory:

def processor(iterator):
    for item in iterator:
        newitem = do_whatever(item)
        yield newitem

newdata = data.mapPartitions(processor)

Basically, your function takes an iterator as an argument, and must either
be a generator (yielding results, as above) or return an iterator.
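
A minimal sketch of the two forms, runnable without a cluster. The
do_whatever helper here is a hypothetical stand-in (it just doubles its
input), and a plain Python list plays the role of one partition; with a real
RDD you would pass either function to mapPartitions instead.

```python
def do_whatever(item):
    # Hypothetical per-item transformation; replace with real work.
    return item * 2


def processor_generator(iterator):
    # Generator form: yields one output per input item.
    for item in iterator:
        yield do_whatever(item)


def processor_returning_iterator(iterator):
    # Equivalent form: returns an iterator instead of yielding.
    return map(do_whatever, iterator)


if __name__ == "__main__":
    # Local stand-in for a single partition's iterator.
    # With Spark (assumed RDD named `data`):
    #     newdata = data.mapPartitions(processor_generator)
    print(list(processor_generator(iter([1, 2, 3]))))
    print(list(processor_returning_iterator(iter([1, 2, 3]))))
```

Both forms consume the partition lazily, so neither needs to hold the whole
partition in memory at once.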
On Sat, May 14, 2016 at 12:39 AM Abi
Hi,
I'm trying to run a simple Spark Streaming application with file streaming.
It's working properly, but when I try to monitor the number of events in
the Streaming UI, it shows 0. Is this an issue, and are there any plans
to fix it?
Regards,
SJ