():
    print i
Result:
Row(name=u'A', age=30, other=u'A30')
Row(name=u'B', age=15, other=u'B15')
Row(name=u'C', age=20, other=u'C200')
On Sat, Apr 25, 2015 at 2:48 PM, Wenlei Xie wenlei@gmail.com wrote:
Sure. A simple example of the data would be (there might be many other
columns):
Name
Hi,
I am trying to answer a simple query with Spark SQL over a Parquet file.
When executing the query several times, the first run takes about 2s
while the later runs take 0.1s.
Looking at the log file, it seems the later runs don't load the data
from disk. However, I didn't enable
--
Wenlei Xie (谢文磊)
Ph.D. Candidate
Department of Computer Science
456 Gates Hall, Cornell University
Ithaca, NY 14853, USA
Email: wenlei@gmail.com
Hi,
I am wondering how we should understand the running time of Spark SQL
queries? For example, the physical query plan and the running time of each
stage? Is there any guide that talks about this?
Thank you!
Best,
Wenlei
Using Object[] in Java just works :).
On Fri, Apr 24, 2015 at 4:56 PM, Wenlei Xie wenlei@gmail.com wrote:
Hi,
I am wondering if there is any way to create a Row in Spark SQL 1.2 in Java
by using a List? It looks like

    ArrayList<Object> something;
    Row.create(something)

will create a row.
On Sat, Apr 18, 2015 at 6:20 PM, Wenlei Xie wenlei@gmail.com wrote:
Hi,
I am wondering what mechanism determines the number of partitions
created by SparkContext.sequenceFile? For example, although my file has
only 4 splits, Spark creates 16 partitions for it. Is it determined
Hi,
I would like to answer the following customized aggregation query in Spark
SQL:
1. Group the table by the value of Name.
2. For each group, choose the tuple with the max value of Age (the ages are
distinct for every name).
I am wondering what's the best way to do this in Spark SQL? Should I use
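To make the intended result concrete, here is a minimal standalone sketch of that group-by-max logic using a plain HashMap, outside Spark entirely. The class and method names (GroupwiseMax, maxAgeByName) are made up for illustration and are not part of the Spark SQL API.

```java
import java.util.*;

// Hypothetical standalone sketch: for each name, keep the max age.
public class GroupwiseMax {
    static Map<String, Integer> maxAgeByName(List<Map.Entry<String, Integer>> rows) {
        Map<String, Integer> best = new HashMap<>();
        for (Map.Entry<String, Integer> row : rows) {
            // Keep the larger of the stored age and this row's age.
            best.merge(row.getKey(), row.getValue(), Math::max);
        }
        return best;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> rows = List.of(
            Map.entry("A", 30), Map.entry("A", 25), Map.entry("B", 15));
        // TreeMap only to get a deterministic print order.
        System.out.println(new TreeMap<>(maxAgeByName(rows))); // prints {A=30, B=15}
    }
}
```

Since the ages are stated to be distinct within each name, the max uniquely identifies one tuple per group; in SQL terms this corresponds to grouping by Name and joining the max(Age) back to recover the full tuple.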
Hi,
I am wondering if there is any way to create a Row in Spark SQL 1.2 in Java
by using a List? It looks like

    ArrayList<Object> something;
    Row.create(something)

will create a row with a single column (and the single column contains the
array).
Best,
Wenlei
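The Object[] remark above comes down to how Java varargs bind: a List argument is boxed as one value, while an Object[] spreads into one argument per element. A hypothetical standalone demo (VarargsDemo and countArgs are made-up names standing in for a varargs factory like Row.create(Object... values)):

```java
import java.util.ArrayList;
import java.util.List;

// Demo: a List passed to a varargs method is ONE argument,
// while an Object[] spreads into one argument per element.
public class VarargsDemo {
    // Stand-in for a varargs factory such as Row.create(Object... values).
    static int countArgs(Object... values) {
        return values.length;
    }

    public static void main(String[] args) {
        List<Object> cells = new ArrayList<>(List.of("A", 30, "A30"));

        int asList = countArgs(cells);            // the whole List binds as one Object
        int asArray = countArgs(cells.toArray()); // the Object[] spreads out

        System.out.println(asList + " vs " + asArray); // prints "1 vs 3"
    }
}
```

So converting the list with toArray() before the varargs call gives one column per element, which matches the "Object[] just works" observation.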
Hi,
I am wondering what mechanism determines the number of partitions
created by SparkContext.sequenceFile? For example, although my file has
only 4 splits, Spark creates 16 partitions for it. Is it determined by the
file size? Is there any way to control it? (Looks like I can only
Hi,
I am currently testing my application with Spark in local mode, and I
set the master to local[4]. One thing I noticed is that when a
groupBy/reduceBy operation is involved, the CPU usage can sometimes be
around 600% to 800%. I am wondering if this is expected? (As only 4 worker threads