Try using org.apache.hadoop.hive.serde2.RegexSerDe
GP
On 27 Jul 2015, at 09:35, ZhuGe t...@outlook.com
wrote:
Hi all:
I am testing the performance of Hive on Spark SQL.
The existing table is created with
ROW FORMAT
SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
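Following the suggestion above, the table can be recreated with the built-in serde instead of the contrib one. A minimal sketch via a 1.x-era HiveContext; the table name, columns, and regex below are placeholders for illustration, not from the original thread:

```scala
// Sketch: switching from the contrib RegexSerDe to the built-in one.
// `sc` is an existing SparkContext; table/columns/regex are hypothetical.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
hiveContext.sql("""
  CREATE TABLE logs_regex (host STRING, request STRING)
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
  WITH SERDEPROPERTIES ("input.regex" = "(\\S+) (.*)")
  STORED AS TEXTFILE
""")
```

The built-in serde lives in hive-serde (on the classpath by default), whereas the contrib one requires the hive-contrib jar, which is a common source of ClassNotFoundException when moving a Hive table to Spark SQL.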
The Spark website states in the "Viewing After the Fact" section
(https://spark.apache.org/docs/latest/monitoring.html) that you can point the
start-history-server.sh script at a directory in order to view the Web UI using
the logs as data source.
Is it possible to point that script to S3?
ak...@sigmoidanalytics.com wrote:
Not quite sure, but try pointing spark.history.fs.logDirectory to your S3 location.
Thanks
Best Regards
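A sketch of what that configuration might look like in spark-defaults.conf; the bucket name and URI scheme are placeholders, and the s3n:// scheme additionally needs AWS credentials set in the Hadoop configuration:

```
# Hypothetical spark-defaults.conf fragment (bucket/path are placeholders).
# Applications must also log their events to the same location for the
# history server to find them.
spark.eventLog.enabled           true
spark.eventLog.dir               s3n://my-bucket/spark-events
spark.history.fs.logDirectory    s3n://my-bucket/spark-events
```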
On Tue, Jun 16, 2015 at 6:26 PM, Gianluca Privitera
gianluca.privite...@studio.unibo.it
wrote:
In Spark
Hi,
I’ve got a problem with Spark Streaming and tshark.
While I’m running locally I have no problems with this code, but when I run it
on an EC2 cluster I get the exception shown just under the code.
def dissection(s: String): Seq[String] = {
try {
Process(hadoop command to create
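The snippet above is cut off in the archive, but the general shape of shelling out from such a function can be sketched with scala.sys.process. The tshark arguments below are hypothetical; note that on a cluster the external binary must be installed on every worker node, which is a common reason this pattern works locally but throws an IOException on EC2:

```scala
// Hedged sketch of the pattern, not the original code: run an external
// command per input and collect its stdout lines. Command and flags are
// placeholders.
import scala.sys.process._

def dissection(s: String): Seq[String] = {
  try {
    // lineStream lazily yields stdout lines of the child process
    Process(Seq("tshark", "-r", s, "-T", "fields", "-e", "ip.src")).lineStream.toSeq
  } catch {
    case _: Exception => Seq.empty  // binary missing on the node, or file unreadable
  }
}
```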
You can find something in the API, nothing more than that I think for now.
Gianluca
On 25 Jun 2014, at 23:36, guxiaobo1982 guxiaobo1...@qq.com wrote:
Hi,
I want to know the full list of functions, syntax, features that Spark SQL
supports, is there some documentations.
Regards,
You can use foreachRDD and then access the RDD data.
Hope this works for you.
Gianluca
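A minimal sketch of that suggestion; the stream name and the per-batch handling are placeholders, not from the original message:

```scala
// Sketch: foreachRDD gives you each batch as a plain RDD, on which the
// usual RDD actions work. `dstream` is a hypothetical DStream.
dstream.foreachRDD { rdd =>
  val sample = rdd.take(10)   // materialize a small sample on the driver
  sample.foreach(println)
}
```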
On 12 Jun 2014, at 10:06, Wolfinger, Fred
fwolfin...@cyberpointllc.commailto:fwolfin...@cyberpointllc.com wrote:
Good morning.
I have a question related to Spark Streaming. I have reduced some data down to
a
If you are launching your application with spark-submit, you can manually edit
the spark-class file to make 1g the baseline.
It’s pretty easy to do and to figure out how once you open the file.
This worked for me, even if it’s not a final solution, of course.
Gianluca
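For reference, the same effect is usually available without editing spark-class, through standard configuration; a sketch (values are illustrative):

```
# Alternative to patching spark-class: pass memory on the command line,
#   spark-submit --driver-memory 1g --executor-memory 1g ...
# or set it once in spark-defaults.conf:
spark.driver.memory    1g
spark.executor.memory  1g
```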
On 12 Jun 2014, at 15:16,
Hi,
I think I may have encountered some kind of bug that at the moment prevents
the correct running of my application on an EC2 cluster.
I'm saying that because the exact same code works wonderfully locally but has a
really strange behaviour on the cluster.
val uri = ssc.textFileStream(args(1))
On Fri, Jun 6, 2014 at 3:00 AM, Gianluca Privitera
gianluca.privite...@studio.unibo.it wrote:
Hi,
I've got a weird question but maybe someone
Is anyone experiencing problems with window operations?
dstream1.print()
val dstream2 = dstream1.groupByKeyAndWindow(Seconds(60))
dstream2.print()
In my application the first print() prints out all the strings and
their keys, but after the window function everything is lost and
nothing gets printed.
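Two things commonly bite here, sketched below with placeholder names: window operations only emit once a full window has elapsed, and stateful/windowed streams generally want a checkpoint directory set. This is a hedged sketch of the usual fix, not a confirmed diagnosis of the problem above:

```scala
// Sketch: set a checkpoint dir and give the window an explicit slide
// duration. Paths and durations are placeholders.
ssc.checkpoint("hdfs:///tmp/spark-checkpoint")   // often required for windowed ops

dstream1.print()
// 60s window, re-evaluated every 10s; output appears only after the
// first full window of batches has been collected
val dstream2 = dstream1.groupByKeyAndWindow(Seconds(60), Seconds(10))
dstream2.print()
```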
Hi,
I've got a weird question but maybe someone has already dealt with it.
My Spark Streaming application needs to
- download a file from a S3 bucket,
- run a script with the file as input,
- create a DStream from this script output.
I've already got the second part done with the rdd.pipe() API
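Since the second step already works with rdd.pipe(), the third step can be sketched by lifting that into the stream with transform(); the script path below is a placeholder and the script must exist on every worker:

```scala
// Sketch: derive a new DStream from per-batch pipe output.
// `inputStream` and the script path are hypothetical.
val scriptOutput = inputStream.transform { rdd =>
  rdd.pipe("/home/hadoop/run_script.sh")  // one output DStream line per stdout line
}
scriptOutput.print()
```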
Hi everyone,
I would like to setup a very simple cluster (specifically using 2 micro
instances only) of Spark on EC2 and make it run a simple Spark Streaming
application I created.
Has anyone actually managed to do that?
Because after launching the scripts from this page: