I'm not sure of your intention, but something like: SELECT SUM(val1), SUM(val2)
FROM table GROUP BY src, dest ?
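The grouped-sum semantics of that query can be sketched in plain Python (illustrative only; the column and table names are made up for the example):

```python
from collections import defaultdict

# Toy rows standing in for: SELECT SUM(val1), SUM(val2) FROM table GROUP BY src, dest
rows = [
    {"src": "a", "dest": "b", "val1": 1, "val2": 10},
    {"src": "a", "dest": "b", "val1": 2, "val2": 20},
    {"src": "a", "dest": "c", "val1": 5, "val2": 50},
]

# One accumulator pair per (src, dest) group.
sums = defaultdict(lambda: [0, 0])
for r in rows:
    key = (r["src"], r["dest"])
    sums[key][0] += r["val1"]
    sums[key][1] += r["val2"]

print(dict(sums))  # {('a', 'b'): [3, 30], ('a', 'c'): [5, 50]}
```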
-Original Message-
From: Shailesh Birari [mailto:sbirar...@gmail.com]
Sent: Friday, March 20, 2015 9:31 AM
To: user@spark.apache.org
Subject: Spark SQL Self join with aggregate
Can you use Varchar or String instead? Currently, Spark SQL converts varchar
into the string type internally (without a max length limitation); however,
the char type is not supported yet.
-Original Message-
From: A.M.Chan [mailto:kaka_1...@163.com]
Sent: Friday, March 20, 2015 9:56
It seems the elasticsearch-hadoop project was built against an old version of
Spark, and you then upgraded the Spark version in the execution environment.
As far as I know, StructField changed its definition in Spark 1.2; can you
confirm the version mismatch first?
From: Todd Nist [mailto:tsind...@gmail.com]
Sent:
Or you need to specify the jars either in configuration or via:
bin/spark-sql --jars mysql-connector-xx.jar
From: fightf...@163.com [mailto:fightf...@163.com]
Sent: Monday, March 16, 2015 2:04 PM
To: sandeep vura; Ted Yu
Cc: user
Subject: Re: Re: Unable to instantiate
It doesn't take effect just by putting the jar files under the lib-managed/jars
folder; you need to put them on the class path explicitly.
From: sandeep vura [mailto:sandeepv...@gmail.com]
Sent: Monday, March 16, 2015 2:21 PM
To: Cheng, Hao
Cc: fightf...@163.com; Ted Yu; user
Subject: Re: Re: Unable
Check the configuration file at
$SPARK_HOME/conf/spark-xxx.conf ?
Cheng Hao
From: Grandl Robert [mailto:rgra...@yahoo.com.INVALID]
Sent: Thursday, March 12, 2015 5:07 AM
To: user@spark.apache.org
Subject: Spark SQL using Hive metastore
Hi guys,
I am new to running Spark SQL / Spark. My goal
You can add the additional jar when submitting your job, something like:
./bin/spark-submit --jars xx.jar …
More options can be listed by just typing ./bin/spark-submit
From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Tuesday, March 10, 2015 8:48 PM
To: user@spark.apache.org
Subject: Does
Currently, Spark SQL doesn't provide an interface for developing custom UDTFs,
but it works seamlessly with Hive UDTFs.
I am working on the UDTF refactoring for Spark SQL, and hopefully will provide
a Hive-independent UDTF interface soon after that.
From: shahab [mailto:shahab.mok...@gmail.com]
Sent:
/pull/3247
From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Wednesday, March 11, 2015 1:44 AM
To: Cheng, Hao
Cc: user@spark.apache.org
Subject: Re: Registering custom UDAFs with HiveConetxt in SparkSQL, how?
Thanks Hao,
But my question concerns UDAFs (user-defined aggregation functions), not UDTFs.
I am not sure whether Hive supports changing the metastore after it is
initialized; I guess not. Spark SQL relies entirely on the Hive Metastore in
HiveContext, which is probably why it doesn't work as expected for Q1.
BTW, in most cases people configure the metastore settings in
hive-site.xml, and will not
Intel has a prototype for doing this; SaiSai and Jason are the authors.
You can probably ask them for some materials.
From: Mohit Anchlia [mailto:mohitanch...@gmail.com]
Sent: Wednesday, March 11, 2015 8:12 AM
To: user@spark.apache.org
Subject: SQL with Spark Streaming
Does Spark Streaming also
[
https://issues.apache.org/jira/browse/SPARK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5817:
-
Description:
{code}
createQueryTest(Specify the udtf output, select d from (select
explode(array(1,1)) d
[
https://issues.apache.org/jira/browse/SPARK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5817:
-
Description:
{code}
createQueryTest(Specify the udtf output, select d from (select
explode(array(key,1
[
https://issues.apache.org/jira/browse/SPARK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5817:
-
Description:
createQueryTest(Specify the udtf output,
select d from (select explode(array(key,1)) d
Can you run the query directly in Hive? Let's first confirm whether it's a bug
in Spark SQL or in your PHP code.
-Original Message-
From: fanooos [mailto:dev.fano...@gmail.com]
Sent: Thursday, March 5, 2015 4:57 PM
To: user@spark.apache.org
Subject: Connection PHP application to Spark Sql thrift server
We
I've tried with the latest code and it seems to work. Which version are you using, Shahab?
From: yana [mailto:yana.kadiy...@gmail.com]
Sent: Wednesday, March 4, 2015 8:47 PM
To: shahab; user@spark.apache.org
Subject: RE: Does SparkSQL support . having count (fieldname) in SQL
statement?
I think the
[
https://issues.apache.org/jira/browse/SPARK-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348293#comment-14348293
]
Cheng Hao edited comment on SPARK-5791 at 3/5/15 7:08 AM:
--
I
[
https://issues.apache.org/jira/browse/SPARK-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348293#comment-14348293
]
Cheng Hao edited comment on SPARK-5791 at 3/5/15 7:07 AM:
--
I
[
https://issues.apache.org/jira/browse/SPARK-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348293#comment-14348293
]
Cheng Hao commented on SPARK-5791:
--
I think this is a typical case that we need
Can you provide the detailed failure call stack?
From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Tuesday, March 3, 2015 3:52 PM
To: user@spark.apache.org
Subject: Supporting Hive features in Spark SQL Thrift JDBC server
Hi,
According to Spark SQL documentation, Spark SQL supports the
Use where('age === 10 || 'age === 4) instead.
-Original Message-
From: Guillermo Ortiz [mailto:konstt2...@gmail.com]
Sent: Tuesday, March 3, 2015 5:14 PM
To: user
Subject: SparkSQL, executing an OR
I'm trying to execute a query with Spark.
(Example from the Spark Documentation)
val teenagers
Hive UDFs are only applicable for a HiveContext and its subclass instances; is
CassandraAwareSQLContext a direct subclass of HiveContext or of SQLContext?
From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Tuesday, March 3, 2015 5:10 PM
To: Cheng, Hao
Cc: user@spark.apache.org
Subject: Re
Use the SchemaRDD / DataFrame API via HiveContext.
Assuming you're using the latest code, something like:
val hc = new HiveContext(sc)
import hc.implicits._
existedRdd.toDF().insertInto("hivetable")
or
existedRdd.toDF().registerTempTable("mydata")
hc.sql("insert into table hivetable select xxx
As the call stack shows, the MongoDB connector is not compatible with the Spark
SQL Data Source interface. The Data Source API changed in 1.2; you probably
need to confirm which Spark version the MongoDB connector was built against.
By the way, a well-formatted call stack would be more
” while starting the spark shell.
From: Anusha Shamanur [mailto:anushas...@gmail.com]
Sent: Wednesday, March 4, 2015 5:07 AM
To: Cheng, Hao
Subject: Re: Spark SQL Thrift Server start exception :
java.lang.ClassNotFoundException:
org.datanucleus.api.jdo.JDOPersistenceManagerFactory
Hi,
I am getting
instance.
-Original Message-
From: Haopu Wang [mailto:hw...@qilinsoft.com]
Sent: Tuesday, March 3, 2015 7:56 AM
To: Cheng, Hao; user
Subject: RE: Is SQLContext thread-safe?
Thanks for the response.
Then I have another question: when will we want to create multiple SQLContext
instances
Copy those jars into the $SPARK_HOME/lib/
datanucleus-api-jdo-3.2.6.jar
datanucleus-core-3.2.10.jar
datanucleus-rdbms-3.2.9.jar
see https://github.com/apache/spark/blob/master/bin/compute-classpath.sh#L120
-Original Message-
From: fanooos [mailto:dev.fano...@gmail.com]
Sent: Tuesday,
I am not sure how Spark SQL is compiled in CDH, but if the -Phive and
-Phive-thriftserver flags weren't specified during the build, most likely it
will not work just by providing the Hive lib jars later on. For example, does
the HiveContext class exist in the assembly jar?
I am also quite
https://issues.apache.org/jira/browse/SPARK-2087
https://github.com/apache/spark/pull/4382
I am working on the prototype; it will be updated soon.
-Original Message-
From: Haopu Wang [mailto:hw...@qilinsoft.com]
Sent: Tuesday, March 3, 2015 8:32 AM
To: Cheng, Hao; user
Subject: RE
This is actually a quite open question. From my understanding, there are
probably several ways to tune it:
* SQL configurations, such as:
  Configuration Key                        Default Value
  spark.sql.autoBroadcastJoinThreshold     10 * 1024 * 1024
  spark.sql.defaultSizeInBytes             10 * 1024 * 1024 + 1
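For reference, a hedged sketch of how one might carry those defaults into spark-submit flags (the helper below is hypothetical glue code, not a Spark API):

```python
# Hypothetical helper: render the SQL config defaults above as --conf flags.
defaults = {
    "spark.sql.autoBroadcastJoinThreshold": 10 * 1024 * 1024,  # 10485760 bytes
    "spark.sql.defaultSizeInBytes": 10 * 1024 * 1024 + 1,      # 10485761 bytes
}

flags = ["--conf %s=%d" % (k, v) for k, v in sorted(defaults.items())]
print(flags[0])  # --conf spark.sql.autoBroadcastJoinThreshold=10485760
```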
Yes, it is thread-safe; at least it's supposed to be.
-Original Message-
From: Haopu Wang [mailto:hw...@qilinsoft.com]
Sent: Monday, March 2, 2015 4:43 PM
To: user
Subject: Is SQLContext thread-safe?
Hi, is it safe to use the same SQLContext to do Select operations in different
threads
$.main(SparkSQLCLIDriver.scala:202)
at
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
Thanks,
Cheng Hao
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional
It works after adding the -Djline.terminal=jline.UnsupportedTerminal
-Original Message-
From: Cheng, Hao [mailto:hao.ch...@intel.com]
Sent: Saturday, February 28, 2015 10:24 AM
To: user@spark.apache.org
Subject: JLine hangs under Windows8
Hi, All
I was trying to run spark sql cli
How many reducers did you set for Hive? With a small data set, Hive will run in
local mode, which always sets the reducer count to 1.
From: Kannan Rajah [mailto:kra...@maprtech.com]
Sent: Thursday, February 26, 2015 3:02 AM
To: Cheng Lian
Cc: user@spark.apache.org
Subject: Re: Spark-SQL 1.2.0 sort
Cheng Hao created SPARK-6034:
Summary: DESCRIBE EXTENDED viewname is not supported for
HiveContext
Key: SPARK-6034
URL: https://issues.apache.org/jira/browse/SPARK-6034
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5941:
-
Summary: `def table` is not using the unresolved logical plan
`UnresolvedRelation` (was: `def table
Cheng Hao created SPARK-5941:
Summary: `def table` is not using the unresolved logical plan
Key: SPARK-5941
URL: https://issues.apache.org/jira/browse/SPARK-5941
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14332003#comment-14332003
]
Cheng Hao commented on SPARK-5941:
--
Eagerly resolving the table probably causes side
[
https://issues.apache.org/jira/browse/SPARK-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5941:
-
Summary: `def table` is not using the unresolved logical plan in
DataFrameImpl (was: `def table
Are you using SQLContext? I think HiveContext is recommended.
Cheng Hao
From: Wush Wu [mailto:w...@bridgewell.com]
Sent: Thursday, February 12, 2015 2:24 PM
To: u...@spark.incubator.apache.org
Subject: Extract hour from Timestamp in Spark SQL
Dear all,
I am new to Spark SQL and have
[
https://issues.apache.org/jira/browse/SPARK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5817:
-
Description:
createQueryTest(insert table with generator with column name,
CREATE TABLE
Cheng Hao created SPARK-5817:
Summary: UDTF column names didn't set properly
Key: SPARK-5817
URL: https://issues.apache.org/jira/browse/SPARK-5817
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319400#comment-14319400
]
Cheng Hao commented on SPARK-5791:
--
Can you also attach the performance comparison result
Cheng Hao created SPARK-5706:
Summary: Support inference schema from a single json string
Key: SPARK-5706
URL: https://issues.apache.org/jira/browse/SPARK-5706
Project: Spark
Issue Type
Cheng Hao created SPARK-5709:
Summary: Add EXPLAIN support for DataFrame API for debugging
purpose
Key: SPARK-5709
URL: https://issues.apache.org/jira/browse/SPARK-5709
Project: Spark
Issue
Cheng Hao created SPARK-5683:
Summary: Improve the json serialization for DataFrame API
Key: SPARK-5683
URL: https://issues.apache.org/jira/browse/SPARK-5683
Project: Spark
Issue Type
Cheng Hao created SPARK-5550:
Summary: Custom UDF is case sensitive for HiveContext
Key: SPARK-5550
URL: https://issues.apache.org/jira/browse/SPARK-5550
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-5550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5550:
-
Description:
SQL in HiveContext should be case-insensitive; however, the following query
will fail
The root cause of this is probably that identical “exprId”s of the
“AttributeReference”s exist when doing a self-join with a “temp table” (temp
table = resolved logical plan).
I will do the bug fix and create a JIRA.
Cheng Hao
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent
Cheng Hao created SPARK-5404:
Summary: Statistic of Logical Plan is too aggressive
Key: SPARK-5404
URL: https://issues.apache.org/jira/browse/SPARK-5404
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5213:
-
Summary: Pluggable SQL Parser Support (was: Support the SQL Parser
Registry)
Pluggable SQL Parser
[
https://issues.apache.org/jira/browse/SPARK-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5213:
-
Description:
Currently, the SQL Parser dialect is hard-coded in SQLContext, which is not
easy to extend
Cheng Hao created SPARK-5364:
Summary: HiveQL transform doesn't support the non output clause
Key: SPARK-5364
URL: https://issues.apache.org/jira/browse/SPARK-5364
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5213:
-
Description:
Currently, the SQL Parser dialect is hard-coded in SQLContext, which is not
easy to extend
It seems the netty jar being used has an incompatible method signature. Can you
check whether there are different versions of the netty jar in your classpath?
From: Walrus theCat [mailto:walrusthe...@gmail.com]
Sent: Sunday, January 18, 2015 3:37 PM
To: user@spark.apache.org
Subject: Re: SparkSQL 1.2.0 sources
Wow, glad to know that it works well. And sorry, the Jira is another issue,
not the same case as here.
From: Bagmeet Behera [mailto:bagme...@gmail.com]
Sent: Saturday, January 17, 2015 12:47 AM
To: Cheng, Hao
Subject: Re: using hiveContext to select a nested Map-data-type from
Hi BB,
Ideally you can do the query like: select key, value.percent from
mytable_data lateral view explode(audiences) f as key, value limit 3;
But there is a bug in HiveContext:
https://issues.apache.org/jira/browse/SPARK-5237
I am working on it now and hopefully will have a patch soon.
Cheng
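The LATERAL VIEW explode in that query can be mimicked in plain Python (illustrative data; the field names are assumptions based on the thread, not the poster's actual schema):

```python
# Each input row carries a map column ("audiences"); explode emits one
# output row per map entry, pairing the row's other columns with key/value.
rows = [{"id": "ad1", "audiences": {"m": 0.4, "f": 0.6}}]

exploded = [
    {"id": r["id"], "key": k, "percent": v}
    for r in rows
    for k, v in sorted(r["audiences"].items())
]
print(exploded)
# [{'id': 'ad1', 'key': 'f', 'percent': 0.6}, {'id': 'ad1', 'key': 'm', 'percent': 0.4}]
```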
Not sure about your question, but SparkStrategies.scala and Optimizer.scala
are a good start if you want the details of the join implementation or
optimization.
-Original Message-
From: Andrew Ash [mailto:and...@andrewash.com]
Sent: Friday, January 16, 2015 4:52 AM
To: Reynold
The Data Source API probably works for this purpose.
It supports column pruning and predicate push-down:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
Examples can also be found in the unit tests:
The log shows it failed during parsing, so the typo shouldn't be the root
cause. But I couldn't reproduce it with the master branch.
I did the test as follows:
sbt/sbt -Phadoop-2.3.0 -Phadoop-2.3 -Phive -Phive-0.13.1 hive/console
scala> sql("SELECT user_id FROM actions where
Cheng Hao created SPARK-5213:
Summary: Support the SQL Parser Registry
Key: SPARK-5213
URL: https://issues.apache.org/jira/browse/SPARK-5213
Project: Spark
Issue Type: New Feature
Cheng Hao created SPARK-5202:
Summary: HiveContext doesn't support the Variables Substitution
Key: SPARK-5202
URL: https://issues.apache.org/jira/browse/SPARK-5202
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5202:
-
Description:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution
[
https://issues.apache.org/jira/browse/SPARK-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao resolved SPARK-4636.
--
Resolution: Not a Problem
The answer with the highest score seems incorrect; it might have been tested
[
https://issues.apache.org/jira/browse/SPARK-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14270466#comment-14270466
]
Cheng Hao edited comment on SPARK-4636 at 1/9/15 2:57 AM
[
https://issues.apache.org/jira/browse/SPARK-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14270466#comment-14270466
]
Cheng Hao edited comment on SPARK-4636 at 1/9/15 2:56 AM
[
https://issues.apache.org/jira/browse/SPARK-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268861#comment-14268861
]
Cheng Hao commented on SPARK-5117:
--
Definitely we can do that then.
Hive Generic UDFs
[
https://issues.apache.org/jira/browse/SPARK-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266124#comment-14266124
]
Cheng Hao commented on SPARK-4366:
--
[~marmbrus] I've uploaded a draft design doc
[
https://issues.apache.org/jira/browse/SPARK-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4366:
-
Attachment: aggregatefunction_v1.pdf
Draft Design Doc.
Aggregation Optimization
[
https://issues.apache.org/jira/browse/SPARK-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao resolved SPARK-5117.
--
Resolution: Won't Fix
This IS NOT a bug of Spark SQL.
Hive changed the LPAD implementation since Hive
Can you paste the error log?
From: Dai, Kevin [mailto:yun...@ebay.com]
Sent: Monday, January 5, 2015 6:29 PM
To: user@spark.apache.org
Subject: Implement customized Join for SparkSQL
Hi, All
Suppose I want to join two tables A and B as follows:
Select * from A join B on A.id = B.id
A is a
Cheng Hao created SPARK-4967:
Summary: File name with comma will cause exception for
SQLContext.parquetFile
Key: SPARK-4967
URL: https://issues.apache.org/jira/browse/SPARK-4967
Project: Spark
multiple parquet files
For the API sqlContext.parquetFile, we need to think about how to support
multiple paths in some other way.
Cheng Hao
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Thursday, December 25, 2014 1:01 PM
To: Daniel Siegmann
Cc: user@spark.apache.org
Subject: Re: Escape
I am wondering if we can provide a more friendly API, rather than
configuration, for this purpose. What do you think, Patrick?
Cheng Hao
-Original Message-
From: Patrick Wendell [mailto:pwend...@gmail.com]
Sent: Thursday, December 25, 2014 3:22 PM
To: Shao, Saisai
Cc: u...@spark.apache.org
Cheng Hao created SPARK-4944:
Summary: Table Not Found exception in Create Table Like
registered RDD table
Key: SPARK-4944
URL: https://issues.apache.org/jira/browse/SPARK-4944
Project: Spark
Cheng Hao created SPARK-4945:
Summary: Add overwrite option support for
SchemaRDD.saveAsParquetFile
Key: SPARK-4945
URL: https://issues.apache.org/jira/browse/SPARK-4945
Project: Spark
Issue
Hi Lam, I can confirm this is a bug in the latest master, and I filed a JIRA
issue for it:
https://issues.apache.org/jira/browse/SPARK-4944
Hopefully a solution will come soon.
Cheng Hao
From: Jerry Lam [mailto:chiling...@gmail.com]
Sent: Wednesday, December 24, 2014 4:26 AM
To: user
[
https://issues.apache.org/jira/browse/SPARK-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4367:
-
Summary: 2 Phase-shuffle to optimize the DISTINCT aggregation (was:
Process the distinct value before
[
https://issues.apache.org/jira/browse/SPARK-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4367:
-
Summary: Partial aggregation support the DISTINCT aggregation (was: 2
Phase-shuffle to optimize
Cheng Hao created SPARK-4904:
Summary: Remove the foldable checking in HiveGenericUdf.eval
Key: SPARK-4904
URL: https://issues.apache.org/jira/browse/SPARK-4904
Project: Spark
Issue Type
[
https://issues.apache.org/jira/browse/SPARK-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14254454#comment-14254454
]
Cheng Hao commented on SPARK-4367:
--
I am working on updating the Aggregation Function
[
https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated HIVE-9004:
Attachment: (was: reset.patch)
Reset doesn't work for the default empty value entry
[
https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated HIVE-9004:
Attachment: HIVE-9004.patch
Reset doesn't work for the default empty value entry
[
https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14251138#comment-14251138
]
Cheng Hao commented on HIVE-9004:
-
Thank you [~szehon], updated.
Reset doesn't work
[
https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4856:
-
Description:
We have data like:
{quote}
TestSQLContext.sparkContext.parallelize(
{ip:27.31.100.29
[
https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4856:
-
Description:
We have data like:
{panel}
TestSQLContext.sparkContext.parallelize(
{ip:27.31.100.29
[
https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4856:
-
Description:
We have data like:
{noformat}
TestSQLContext.sparkContext.parallelize(
{ip:27.31.100.29
[
https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4856:
-
Description:
We have data like:
{noformat}
TestSQLContext.sparkContext.parallelize(
{ip:27.31.100.29
[
https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4856:
-
Description:
We have data like:
{noformat}
TestSQLContext.sparkContext.parallelize(
{ip:27.31.100.29
[
https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4856:
-
Description:
We have data like:
{noformat}
TestSQLContext.sparkContext.parallelize(
{ip:27.31.100.29
Cheng Hao created SPARK-4856:
Summary: Null empty string should not be considered as
StringType at beginning in Json schema inferring
Key: SPARK-4856
URL: https://issues.apache.org/jira/browse/SPARK-4856
[
https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4856:
-
Description:
We have data like:
{panel}
TestSQLContext.sparkContext.parallelize(
{ip:27.31.100.29
[
https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-4856:
-
Description:
We have data like:
{code:java}
TestSQLContext.sparkContext.parallelize(
{ip:27.31.100.29
As the error log shows, you may need to register it as:
sqlContext.registerFunction("toHour", toHour _)
The "_" means you are passing the function as a parameter, not invoking it.
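The function-value vs. invocation distinction has a direct Python analog (a sketch only; the registry dict here is hypothetical and not the Spark API):

```python
def to_hour(ts):
    # Extract the hour field from an "HH:MM:SS" string.
    return ts.split(":")[0]

registry = {}
registry["toHour"] = to_hour  # store the function itself (like `toHour _` in Scala)
# registry["toHour"] = to_hour("13:45:00")  # would store one result, not the function

print(registry["toHour"]("13:45:00"))  # 13
```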
Cheng Hao
From: Xuelin Cao [mailto:xuelin...@yahoo.com.INVALID]
Sent: Monday, December 15, 2014 5:28 PM
To: User
Part of it can be found at:
https://github.com/apache/spark/pull/3429/files#diff-f88c3e731fcb17b1323b778807c35b38R34
Sorry, it's a to-be-reviewed PR, but it should still be informative.
Cheng Hao
-Original Message-
From: Alessandro Baretta [mailto:alexbare...@gmail.com]
Sent: Friday
It works exactly like Create Table As Select (CTAS) in Hive.
Cheng Hao
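For illustration, the CTAS shape can be sketched with SQLite (an assumption for portability; Hive's CTAS has its own restrictions, this only shows the statement's semantics):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO src VALUES (?, ?)", [(1, "a"), (2, "b")])

# CTAS: the new table takes both its schema and its rows from the SELECT.
conn.execute("CREATE TABLE dst AS SELECT id, name FROM src WHERE id > 1")

print(conn.execute("SELECT * FROM dst").fetchall())  # [(2, 'b')]
```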
From: Anas Mosaad [mailto:anas.mos...@incorta.com]
Sent: Wednesday, December 10, 2014 11:59 AM
To: Michael Armbrust
Cc: Manoj Samel; user@spark.apache.org
Subject: Re: Can HiveContext be used without using Hive
I've created (reused) the PR https://github.com/apache/spark/pull/3336;
hopefully we can fix this regression.
Thanks for reporting it.
Cheng Hao
-Original Message-
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Saturday, December 6, 2014 4:51 AM
To: kb
Cc: d