Cache in Spark

2015-10-09 Thread vinod kumar
Hi Guys,

May I know whether caching is enabled in Spark by default?

Thanks,
Vinod


Re: Cache in Spark

2015-10-09 Thread vinod kumar
Thanks Natu,

If so, can you please share the Spark SQL query to check whether a
given table is cached or not, if you know it?
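For reference, there is a programmatic check on SQLContext; a minimal sketch for Spark 1.3+, with the table name illustrative (as far as I know there is no SQL statement that reports cache status, only statements that change it):

```scala
// True once the table has been cached, e.g. via "CACHE TABLE sampleTable".
sqlContext.isCached("sampleTable")

// Caching itself can be toggled from SQL:
sqlContext.sql("CACHE TABLE sampleTable")
sqlContext.sql("UNCACHE TABLE sampleTable")
```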

Thanks,
Vinod

On Fri, Oct 9, 2015 at 2:26 PM, Natu Lauchande <nlaucha...@gmail.com> wrote:

>
> I don't think so.
>
> Spark does not keep results in memory unless you tell it to.
>
> You have to explicitly call the cache method in your RDD:
> linesWithSpark.cache()
>
> Thanks,
> Natu
>
>
>
>
> On Fri, Oct 9, 2015 at 10:47 AM, vinod kumar <vinodsachin...@gmail.com>
> wrote:
>
>> Hi Guys,
>>
>> May I know whether caching is enabled in Spark by default?
>>
>> Thanks,
>> Vinod
>>
>
>


Buffer Overflow exception

2015-07-31 Thread vinod kumar
Hi,

I am getting a buffer overflow exception while using Spark via the Thrift
server. May I know how to overcome this?

Code:

HqlConnection con = new HqlConnection("localhost", 10001,
    HiveServer.HiveServer2);
con.Open();
// tableQuery is the query which I used to create the table
HqlCommand createCommand = new HqlCommand(tableQuery, con);
createCommand.ExecuteNonQuery();


#. It seems Spark works slower compared to SQL Server. May I know the
reason for that?
My case is:
I have a table called TestTable with 4 records in SQL Server,
and I executed a query that returned its result in 1 sec.
Then I converted the same table to CSV, exported it to Spark, and executed
the same query as in the code above, but it takes almost 2 minutes
to return the results.
May I know the reason for this slow processing too?

Thanks,
Vinod


Functions in Spark SQL

2015-07-27 Thread vinod kumar
Hi,

May I know how to use the functions mentioned in
http://spark.apache.org/docs/1.4.0/api/scala/index.html#org.apache.spark.sql.functions$
in Spark SQL?

When I use a query like

Select last(column) from tablename

I am getting an error like:


15/07/27 03:00:00 INFO exec.FunctionRegistry: Unable to lookup UDF in metastore:
org.apache.hadoop.hive.ql.metadata.HiveException:
MetaException(message:NoSuchObjectException(message:Function default.last does not exist))
java.lang.RuntimeException: Couldn't find function last
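For reference, the functions on that page are exposed through the DataFrame API; reaching last through a plain SQL string falls back to the Hive function registry, which is what produces the metastore lookup above. A sketch of the DataFrame route, assuming Spark 1.3+ and a registered table sampleTable with a product column (names illustrative):

```scala
import org.apache.spark.sql.functions.last

// Look up the registered table as a DataFrame and aggregate with
// org.apache.spark.sql.functions.last instead of a SQL string,
// so no Hive UDF lookup is involved.
val df = sqlContext.table("sampleTable")
df.agg(last("product")).show()
```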

Thanks,
Vinod


Re: Functions in Spark SQL

2015-07-27 Thread vinod kumar
Hi,

Select last(product) from sampleTable

Spark Version 1.3

-Vinod

On Mon, Jul 27, 2015 at 3:48 AM, fightf...@163.com fightf...@163.com
wrote:

 Hi, there

 I tested with sqlContext.sql("select funcName(param1,param2,...) from
 tableName") and it just worked fine.

 Would you like to paste your test code here? And which version of Spark
 are you using?

 Best,
 Sun.

 --
 fightf...@163.com


 *From:* vinod kumar vinodsachin...@gmail.com
 *Date:* 2015-07-27 15:04
 *To:* User user@spark.apache.org
 *Subject:* Functions in Spark SQL
 Hi,

 May I know how to use the functions mentioned in
 http://spark.apache.org/docs/1.4.0/api/scala/index.html#org.apache.spark.sql.functions$
 in spark sql?

 When I use a query like

 Select last(column) from tablename

 I am getting an error like:


 15/07/27 03:00:00 INFO exec.FunctionRegistry: Unable to lookup UDF in metastore:
 org.apache.hadoop.hive.ql.metadata.HiveException:
 MetaException(message:NoSuchObjectException(message:Function default.last does not exist))
 java.lang.RuntimeException: Couldn't find function last

 Thanks,
 Vinod





Create table from local machine

2015-07-23 Thread vinod kumar
Hi,

I need to create a table in Spark. For that I have uploaded a CSV file
to HDFS and created a table using the following query:

"CREATE EXTERNAL TABLE IF NOT EXISTS " + tableName + " (teams string, runs
int) " + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '" +
hdfsPath + "';"

May I know whether there is any way to create a table in Spark without
moving the file to HDFS?
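One way to avoid HDFS entirely may be to read the local CSV through a file:// URL and register the result as a temporary table. A sketch for the Spark 1.3 Scala shell, with the path, case class, and table name illustrative:

```scala
import sqlContext.implicits._

// Schema matching the CSV columns (teams string, runs int).
case class Score(teams: String, runs: Int)

// Read straight from the local filesystem -- no HDFS upload required.
val scores = sc.textFile("file:///C:/data/scores.csv")
  .map(_.split(","))
  .map(f => Score(f(0), f(1).trim.toInt))
  .toDF()

// Session-scoped table, queryable from Spark SQL.
scores.registerTempTable("scores")
sqlContext.sql("SELECT teams, SUM(runs) FROM scores GROUP BY teams").show()
```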

Thanks,
Vinod


SQL Server to Spark

2015-07-23 Thread vinod kumar
Hi Everyone,

I need to use a table from MS SQL Server in Spark. Can anyone please
share the optimal way to do that?
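One common route is Spark SQL's JDBC data source over Microsoft's SQL Server JDBC driver (the driver jar must be on the classpath). A sketch in the Spark 1.3 style; the URL, credentials, and table name are placeholders:

```scala
// Load a SQL Server table as a DataFrame through the JDBC data source.
val df = sqlContext.load("jdbc", Map(
  "url"     -> "jdbc:sqlserver://localhost:1433;databaseName=TestDb;user=sa;password=secret",
  "dbtable" -> "dbo.TestTable"))

// Register it so it can be queried with Spark SQL.
df.registerTempTable("TestTable")
sqlContext.sql("SELECT COUNT(*) FROM TestTable").show()
```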

Thanks in advance,
Vinod


Re: Spark Intro

2015-07-14 Thread vinod kumar
Hi Akhil

Is my choice to switch to Spark a good one? I don't have enough
information regarding the limitations and working environment of Spark.
I tried Spark SQL, but it seems to return data slower than MS SQL.
(I have tested with data which has 4 records.)



On Tue, Jul 14, 2015 at 3:50 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:

 This is where you can get started
 https://spark.apache.org/docs/latest/sql-programming-guide.html

 Thanks
 Best Regards

 On Mon, Jul 13, 2015 at 3:54 PM, vinod kumar vinodsachin...@gmail.com
 wrote:


 Hi Everyone,

 I am developing an application which handles bulk data, around millions of
 records (this may vary as per the user's requirement). As of now I am
 using MS SQL Server as the back-end and it works fine, but when I perform some
 operations on large data I get overflow exceptions. I heard about
 Spark, that it is a faster computation engine than SQL (correct me if I am
 wrong), so I thought to switch my application to Spark. Is my decision
 right?
 My user environment is:
 #. Windows 8
 #. Data in the millions of records.
 #. Need to perform filtering and sorting operations with aggregations
 frequently (for analytics).

 Thanks in-advance,

 Vinod





Re: Spark Intro

2015-07-14 Thread vinod kumar
Thank you Hafsa

On Tue, Jul 14, 2015 at 11:09 AM, Hafsa Asif hafsa.a...@matchinguu.com
wrote:

 Hi,
 I was also in the same situation, as we were using MySQL. Let me give some
 clarifications:
 1. Spark provides a great methodology for big data analysis. So, if you
 want to make your system more analytical and want deep, prepared analytical
 methods to analyze your data, then it's a very good option.
 2. If you want to get rid of the old behavior of MS SQL and want fast
 responses from the database with huge datasets, then you can take any
 NoSQL database.

 In my case I selected Aerospike for data storage and applied the Spark
 analytical engine on top of it. It gives me really good response times, and
 I have a plan to go into real production with this combination.

 Best,
 Hafsa

 2015-07-14 11:49 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:

 It might take some time to understand the ecosystem. I'm not sure
 what kind of environment you have (#cores, memory, etc.). To
 start with, you can basically use a JDBC connector, or dump your data as CSV
 and load it into Spark and query it. You get the advantage of caching if
 you have more memory, and if you have enough cores, 4 records are
 nothing.

 Thanks
 Best Regards

 On Tue, Jul 14, 2015 at 3:09 PM, vinod kumar vinodsachin...@gmail.com
 wrote:

 Hi Akhil

 Is my choice to switch to Spark a good one? I don't have enough
 information regarding the limitations and working environment of Spark.
 I tried Spark SQL, but it seems to return data slower than MS SQL.
 (I have tested with data which has 4 records.)



 On Tue, Jul 14, 2015 at 3:50 AM, Akhil Das ak...@sigmoidanalytics.com
 wrote:

 This is where you can get started
 https://spark.apache.org/docs/latest/sql-programming-guide.html

 Thanks
 Best Regards

 On Mon, Jul 13, 2015 at 3:54 PM, vinod kumar vinodsachin...@gmail.com
 wrote:


 Hi Everyone,

 I am developing an application which handles bulk data, around millions of
 records (this may vary as per the user's requirement). As of now I am
 using MS SQL Server as the back-end and it works fine, but when I perform some
 operations on large data I get overflow exceptions. I heard about
 Spark, that it is a faster computation engine than SQL (correct me if I am
 wrong), so I thought to switch my application to Spark. Is my decision
 right?
 My user environment is:
 #. Windows 8
 #. Data in the millions of records.
 #. Need to perform filtering and sorting operations with aggregations
 frequently (for analytics).

 Thanks in-advance,

 Vinod








Spark Intro

2015-07-13 Thread vinod kumar
Hi Everyone,

I am developing an application which handles bulk data, around millions of
records (this may vary as per the user's requirement). As of now I am using
MS SQL Server as the back-end and it works fine, but when I perform some
operations on large data I get overflow exceptions. I heard about
Spark, that it is a faster computation engine than SQL (correct me if I am
wrong), so I thought to switch my application to Spark. Is my decision
right?
My user environment is:
#. Windows 8
#. Data in the millions of records.
#. Need to perform filtering and sorting operations with aggregations
frequently (for analytics).

Thanks in-advance,

Vinod


Caching in spark

2015-07-09 Thread vinod kumar
Hi Guys,

Can anyone please share how to use the caching feature of Spark via
Spark SQL queries?
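For reference, caching can be driven from SQL statements directly; a minimal sketch, assuming a table already registered as sampleTable (the name is illustrative):

```sql
-- Materialize the table in memory (eager in Spark >= 1.2).
CACHE TABLE sampleTable;

-- Or defer materialization until the table is first scanned.
CACHE LAZY TABLE sampleTable;

-- Release the cached copy.
UNCACHE TABLE sampleTable;
```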

-Vinod


SPARK vs SQL

2015-07-09 Thread vinod kumar
Hi Everyone,

Is there any document/material that compares Spark with SQL Server?

If so, please share the details.

Thanks,
Vinod


Re: Data Processing speed SQL Vs SPARK

2015-07-09 Thread vinod kumar
For record counts below 50,000, SQL is better, right?


On Fri, Jul 10, 2015 at 12:18 AM, ayan guha guha.a...@gmail.com wrote:

 With your load, either should be fine.

 I would suggest you to run couple of quick prototype.

 Best
 Ayan

 On Fri, Jul 10, 2015 at 2:06 PM, vinod kumar vinodsachin...@gmail.com
 wrote:

 Ayan,

 I would want to process data of around 5 records up to 2L records
 (in flat files).

 Is there any scale to decide which technology is best,
 SQL or Spark?



 On Thu, Jul 9, 2015 at 9:40 AM, ayan guha guha.a...@gmail.com wrote:

 It depends on workload. How much data you would want to process?
 On 9 Jul 2015 22:28, vinod kumar vinodsachin...@gmail.com wrote:

 Hi Everyone,

 I am new to spark.

 I am using SQL in my application to handle data. I am thinking
 of moving to Spark now.

 Is data processing speed of spark better than SQL server?

 Thank,
 Vinod





 --
 Best Regards,
 Ayan Guha



Re: Data Processing speed SQL Vs SPARK

2015-07-09 Thread vinod kumar
Ayan,

I would want to process data of around 5 records up to 2L records
(in flat files).

Is there any scale to decide which technology is best,
SQL or Spark?



On Thu, Jul 9, 2015 at 9:40 AM, ayan guha guha.a...@gmail.com wrote:

 It depends on workload. How much data you would want to process?
 On 9 Jul 2015 22:28, vinod kumar vinodsachin...@gmail.com wrote:

 Hi Everyone,

 I am new to spark.

 I am using SQL in my application to handle data. I am thinking of
 moving to Spark now.

 Is data processing speed of spark better than SQL server?

 Thank,
 Vinod




Data Processing speed SQL Vs SPARK

2015-07-09 Thread vinod kumar
Hi Everyone,

I am new to Spark.

I am using SQL in my application to handle data. I am thinking of moving
to Spark now.

Is the data processing speed of Spark better than SQL Server?

Thanks,
Vinod


Re: UDF in spark

2015-07-08 Thread vinod kumar
Thanks Vishnu,

When I restart the service, the UDF is no longer accessible to my query. I
need to run the mentioned block again to use the UDF.
Is there any way to keep a UDF registered in sqlContext permanently?
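For reference, Hive itself (0.13+) can register a JAR-backed UDF permanently in the metastore so that it survives restarts; whether a given Spark version's HiveContext accepts this DDL would need checking, and the class name and jar path below are illustrative:

```sql
-- Permanent function recorded in the Hive metastore (Hive 0.13+).
CREATE FUNCTION default.my_square AS 'com.example.udf.Square'
USING JAR 'hdfs:///user/hive/udfs/my-udfs.jar';
```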

Thanks,
Vinod

On Wed, Jul 8, 2015 at 7:16 AM, VISHNU SUBRAMANIAN 
johnfedrickena...@gmail.com wrote:

 Hi,

 sqlContext.udf.register("udfname", functionName _)

 example:

 def square(x: Int): Int = { x * x }

 register the udf as below:

 sqlContext.udf.register("square", square _)

 Thanks,
 Vishnu

 On Wed, Jul 8, 2015 at 2:23 PM, vinod kumar vinodsachin...@gmail.com
 wrote:

 Hi Everyone,

 I am new to Spark. May I know how to define and use a User Defined Function
 in Spark SQL?

 I want to use the defined UDF in SQL queries.

 My Environment
 Windows 8
 spark 1.3.1

 Warm Regards,
 Vinod






Re: UDF in spark

2015-07-08 Thread vinod kumar
Thank you for the quick response, Vishnu.

I have the following doubts too:

1. Is there any way to upload files to HDFS programmatically using the C#
language?
2. Is there any way to automatically load a Scala block of code (for a UDF)
when I start the Spark service?

-Vinod

On Wed, Jul 8, 2015 at 7:57 AM, VISHNU SUBRAMANIAN 
johnfedrickena...@gmail.com wrote:

 HI Vinod,

 Yes, if you want to use a Scala or Python function you need the block of
 code.

 Only Hive UDFs are available permanently.

 Thanks,
 Vishnu

 On Wed, Jul 8, 2015 at 5:17 PM, vinod kumar vinodsachin...@gmail.com
 wrote:

 Thanks Vishnu,

 When I restart the service, the UDF is no longer accessible to my query. I
 need to run the mentioned block again to use the UDF.
 Is there any way to keep a UDF registered in sqlContext permanently?

 Thanks,
 Vinod

 On Wed, Jul 8, 2015 at 7:16 AM, VISHNU SUBRAMANIAN 
 johnfedrickena...@gmail.com wrote:

 Hi,

 sqlContext.udf.register("udfname", functionName _)

 example:

 def square(x: Int): Int = { x * x }

 register the udf as below:

 sqlContext.udf.register("square", square _)

 Thanks,
 Vishnu

 On Wed, Jul 8, 2015 at 2:23 PM, vinod kumar vinodsachin...@gmail.com
 wrote:

 Hi Everyone,

 I am new to Spark. May I know how to define and use a User Defined Function
 in Spark SQL?

 I want to use the defined UDF in SQL queries.

 My Environment
 Windows 8
 spark 1.3.1

 Warm Regards,
 Vinod








UDF in spark

2015-07-08 Thread vinod kumar
Hi Everyone,

I am new to Spark. May I know how to define and use a User Defined Function
in Spark SQL?

I want to use the defined UDF in SQL queries.
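A minimal end-to-end sketch for the Spark 1.3 Scala shell: define a function, register it on sqlContext, then call it from a SQL string (the table and column names are illustrative):

```scala
// 1. An ordinary Scala function.
def square(x: Int): Int = x * x

// 2. Register it under the name SQL queries will use.
sqlContext.udf.register("square", square _)

// 3. Call it like any built-in function, against a registered table.
sqlContext.sql("SELECT square(runs) FROM scores").show()
```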

My Environment
Windows 8
spark 1.3.1

Warm Regards,
Vinod