Hi, the example that I provided was not very clear, so I have added a clearer
example in JIRA.
Thanks
Cheers
Gen
On Wed, Feb 22, 2017 at 3:47 PM, gen tang <gen.tan...@gmail.com> wrote:
> Hi Kazuaki Ishizaki
>
> Thanks a lot for your help. It works. However, a more strange bug appea
"overwrite").parquet(dir)
> spark.catalog.refreshByPath(dir) // insert a NEW statement
> val df1 = spark.read.parquet(dir)
> df1.count // outputs 1000, which is correct; in fact, every operation except
> df1.filter("id>10") returns the correct result.
> f(df1).count // out
Hi All,
I may have found a related issue in JIRA:
https://issues.apache.org/jira/browse/SPARK-15678
This issue is closed; maybe we should reopen it.
Thanks
Cheers
Gen
On Wed, Feb 22, 2017 at 1:57 PM, gen tang <gen.tan...@gmail.com> wrote:
> Hi All,
>
> I found a strange bug w
Hi All,
I found a strange bug related to reading data from an updated
path in combination with a cache operation.
Please consider the following code:
import org.apache.spark.sql.DataFrame

def f(data: DataFrame): DataFrame = {
  val df = data.filter("id>10")
  df.cache   // mark the filtered DataFrame for caching
  df.count   // run an action so the cache is materialized
  df
}
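For readers who want to try the scenario end to end, here is a minimal sketch of how the pieces quoted in this thread might fit together. The local SparkSession, the scratch directory, and the first write of 100 rows are assumptions made for illustration; they are not taken from the original report, and the sketch only prints the counts without claiming what they should be.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object RefreshByPathSketch {
  // Same helper as in the message above: filter, cache, and materialize.
  def f(data: DataFrame): DataFrame = {
    val df = data.filter("id>10")
    df.cache()
    df.count()
    df
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session; the report does not say how Spark was started.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("refresh-by-path-sketch")
      .getOrCreate()

    // Hypothetical scratch location for the parquet files.
    val dir = Files.createTempDirectory("refresh-sketch").resolve("data").toString

    // Assumption: an initial, smaller data set is written and cached through f.
    spark.range(100).write.mode("overwrite").parquet(dir)
    val before = f(spark.read.parquet(dir)).count()

    // Overwrite the same path, then invalidate anything cached for it.
    spark.range(1000).write.mode("overwrite").parquet(dir)
    spark.catalog.refreshByPath(dir)
    val df1 = spark.read.parquet(dir)

    println(s"rows after overwrite: ${df1.count()}")
    println(s"f(...).count before overwrite: $before")
    println(s"f(...).count after overwrite:  ${f(df1).count()}")

    spark.stop()
  }
}
```

Comparing the last two printed values against a plain `df1.filter("id>10").count` on the same Spark version is one way to check whether stale cached data is being reused.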
-- Forwarded message --
From: gen tang <gen.tan...@gmail.com>
Date: Fri, Nov 6, 2015 at 12:14 AM
Subject: Re: dataframe slow down with tungsten turn on
To: "Cheng, Hao" <hao.ch...@intel.com>
Hi,
My application is as follows:
1. create dataframe from h
Hi,
Recently, I used Spark SQL to do a join on a non-equality condition (condition1
or condition2, for example).
Spark uses broadcastNestedLoopJoin for this. Assume that one of the
dataframes (df1) is created from neither a Hive table nor a local collection, and the
other one (df2) is created from a Hive table. For
not sure how you created the df1 instance, but we had better make its
statistics reflect its real size and let the framework
decide what to do. Hopefully Spark SQL can support non-equal joins for
large tables in the next release.
Hao
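To make the discussion concrete, here is a minimal sketch of a join on `condition1 or condition2`. Note that it uses an explicit `broadcast` hint on one side, which is a different technique from what the reply above recommends (accurate statistics, with the framework choosing the strategy). The column names, the sample rows, and the choice of which side to broadcast are all assumptions for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.broadcast

object NonEquiJoinSketch {
  // Joins on (a < c) OR (b > d); there is no equality key, so hash and
  // sort-merge joins do not apply. The hint asks Spark to broadcast `right`.
  def nonEquiJoin(left: DataFrame, right: DataFrame): DataFrame = {
    val cond = (left("a") < right("c")) || (left("b") > right("d"))
    left.join(broadcast(right), cond)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("non-equi-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for df1 and df2 from the thread.
    val df1 = Seq((1, 10), (2, 20), (3, 30)).toDF("a", "b")
    val df2 = Seq((1, 25), (5, 5)).toDF("c", "d")

    val joined = nonEquiJoin(df1, df2)
    joined.explain() // the physical plan shows which join strategy was picked
    joined.show()

    spark.stop()
  }
}
```

Checking the output of `explain()` with and without the hint is a quick way to see which side Spark decides to broadcast on its own.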
*From:* gen tang [mailto:gen.tan
Hi,
I have a stupid question:
Is it possible to use Spark with a Teradata data warehouse? I have read some
news online saying yes. However, I couldn't find any example of
this.
Thanks in advance.
Cheers
Gen