Cache in Spark
Hi guys,

May I know whether caching is enabled in Spark by default?

Thanks,
Vinod
Re: Cache in Spark
Thanks Natu. If so, can you please share the Spark SQL query to check whether a given table is cached or not, if you know it?

Thanks,
Vinod

On Fri, Oct 9, 2015 at 2:26 PM, Natu Lauchande <nlaucha...@gmail.com> wrote:
> I don't think so. Spark does not keep results in memory unless you tell it to.
> You have to explicitly call the cache method on your RDD:
>
>     linesWithSpark.cache()
>
> Thanks,
> Natu
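As Natu says, nothing is cached unless you ask for it. A minimal, untested sketch of checking and controlling caching from the SQLContext in Spark 1.x ("testTable" is a placeholder name):

```scala
// Spark 1.x: caching is opt-in; nothing is cached by default.
sqlContext.cacheTable("testTable")          // mark the table for in-memory caching
println(sqlContext.isCached("testTable"))   // true once the table is marked cached
sqlContext.uncacheTable("testTable")        // release the cached data
```

`isCached` answers the question in the thread: it reports whether a given registered table is currently marked for caching.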
Buffer Overflow exception
Hi,

I am getting a buffer overflow exception while using Spark via the Thrift server. May I know how to overcome this?

Code:

    HqlConnection con = new HqlConnection("localhost", 10001, HiveServer.HiveServer2);
    con.Open();
    // tableQuery is the query I used to create the table
    HqlCommand createCommand = new HqlCommand(tableQuery, con);
    createCommand.ExecuteNonQuery();

Also, Spark seems to work slower compared to SQL Server. May I know the reason for that? My case: I have a table called TestTable with 4 records in SQL Server, and a query against it returns its result in 1 second. I converted the same table to CSV, imported it into Spark, and ran the same query as in the code above, but it takes almost 2 minutes to return the results. May I know the reason for this slowness too?

Thanks,
Vinod
Functions in Spark SQL
Hi,

May I know how to use the functions listed at http://spark.apache.org/docs/1.4.0/api/scala/index.html#org.apache.spark.sql.functions$ in Spark SQL? When I run a query like

    SELECT last(column) FROM tablename

I get an error like:

    15/07/27 03:00:00 INFO exec.FunctionRegistry: Unable to lookup UDF in metastore:
    org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:NoSuchObjectException(message:Function default.last does not exist))
    java.lang.RuntimeException: Couldn't find function last

Thanks,
Vinod
Re: Functions in Spark SQL
Hi,

    SELECT last(product) FROM sampleTable

Spark version: 1.3

-Vinod

On Mon, Jul 27, 2015 at 3:48 AM, fightf...@163.com wrote:
> Hi there,
> I tested with sqlContext.sql("select funcName(param1, param2, ...) from tableName") and it just worked fine. Would you like to paste your test code here? And which version of Spark are you using?
> Best,
> Sun.
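One way around the metastore lookup, assuming Spark 1.3+, is to call `last` through the DataFrame API instead of the SQL parser, since the functions on that documentation page are Scala functions rather than registered SQL UDFs. A sketch ("sampleTable" and "product" are taken from the thread):

```scala
import org.apache.spark.sql.functions._

// Aggregate with functions.last directly, bypassing the SQL function registry
val df = sqlContext.table("sampleTable")
df.agg(last("product")).show()
```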
Create table from local machine
Hi,

I need to create a table in Spark. For that, I uploaded a CSV file to HDFS and created a table using the following query (built by string concatenation):

    "CREATE EXTERNAL TABLE IF NOT EXISTS " + tableName + " (teams string, runs int) " +
    "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '" + hdfspath + "'"

May I know whether there is any way to create a table in Spark without moving the file to HDFS?

Thanks,
Vinod
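A local file can be read directly with a `file://` path and exposed as a temporary table, with no HDFS upload, as long as the path is readable on every node. A hedged sketch against Spark 1.3 (the file path, case class, and table name are hypothetical):

```scala
import sqlContext.implicits._

case class Score(teams: String, runs: Int)

// Read a local CSV via the file:// scheme and register it as a temp table
val scores = sc.textFile("file:///C:/data/scores.csv")
  .map(_.split(","))
  .map(p => Score(p(0), p(1).trim.toInt))
  .toDF()
scores.registerTempTable("scores")
sqlContext.sql("SELECT teams, SUM(runs) FROM scores GROUP BY teams").show()
```

On a multi-node cluster the local path must exist on all workers; on a single machine (as in this thread's Windows 8 setup) that is not an issue.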
SQL Server to Spark
Hi everyone,

I need to use a table from MS SQL Server in Spark. Can anyone share the best way to do that?

Thanks in advance,
Vinod
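One common route, assuming Spark 1.4+ with the Microsoft JDBC driver jar on the classpath, is the JDBC data source. A sketch (the URL, database name, and credentials are placeholders):

```scala
// Read a SQL Server table over JDBC into a DataFrame
val df = sqlContext.read.format("jdbc").options(Map(
  "url"     -> "jdbc:sqlserver://localhost:1433;databaseName=MyDb;user=sa;password=secret",
  "dbtable" -> "TestTable",
  "driver"  -> "com.microsoft.sqlserver.jdbc.SQLServerDriver"
)).load()
df.registerTempTable("TestTable")   // now queryable from Spark SQL
```

On Spark 1.3 the equivalent entry point is `sqlContext.load("jdbc", options)` rather than `sqlContext.read`.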
Re: Spark Intro
Hi Akhil,

Is my choice to switch to Spark a good one? I don't have enough information about Spark's limitations and working environment. I tried Spark SQL, but it seems to return data slower than MS SQL (I tested with a table of only 4 records).

On Tue, Jul 14, 2015 at 3:50 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
> This is where you can get started:
> https://spark.apache.org/docs/latest/sql-programming-guide.html
> Thanks
> Best Regards
Re: Spark Intro
Thank you,
Hafsa

On Tue, Jul 14, 2015 at 11:09 AM, Hafsa Asif <hafsa.a...@matchinguu.com> wrote:
> Hi,
> I was in the same situation, as we were using MySQL. Let me give some clarifications:
> 1. Spark provides a great methodology for big data analysis. So if you want to make your system more analytical and apply deep, prepared analytical methods to your data, it is a very good option.
> 2. If you want to get rid of the old behavior of MS SQL and want fast responses from the database on huge datasets, you can take any NoSQL database. In my case I selected Aerospike for data storage and applied the Spark analytical engine on top of it. It gives me really good response times, and I plan to go into real production with this combination.
> Best,
> Hafsa
>
> 2015-07-14 11:49 GMT+02:00 Akhil Das <ak...@sigmoidanalytics.com>:
>> It might take some time to understand the ecosystem. I'm not sure what kind of environment you have (number of cores, memory, etc.). To start with, you can use a JDBC connector, or dump your data as CSV, load it into Spark, and query it. You get the advantage of caching if you have more memory; also, with enough cores, 4 records are nothing.
>> Thanks
>> Best Regards
Spark Intro
Hi everyone,

I am developing an application that handles bulk data, around millions of records (this may vary with the user's requirements). As of now I am using MS SQL Server as the back end, and it works fine, but when I perform some operations on large data I get overflow exceptions. I heard that Spark is a faster computation engine than SQL Server (correct me if I am wrong), so I thought of switching my application to Spark. Is my decision right?

My user environment is:
#. Windows 8
#. Data in the millions of records
#. Need to perform filtering and sorting operations with aggregations frequently (for analytics)

Thanks in advance,
Vinod
Caching in spark
Hi guys,

Can anyone please share how to use the caching feature of Spark via Spark SQL queries?

-Vinod
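If the data is registered as a table, caching can be requested from plain Spark SQL statements, e.g. through the Thrift server or `sqlContext.sql(...)`. A sketch against Spark 1.x (table names are placeholders):

```sql
CACHE TABLE testTable;        -- materializes the table in memory (eager in Spark 1.2+)
CACHE LAZY TABLE otherTable;  -- marks it cached but materializes on first use
UNCACHE TABLE testTable;      -- frees the cached copy
```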
SPARK vs SQL
Hi everyone,

Is there any document or material that compares Spark with SQL Server? If so, please share the details.

Thanks,
Vinod
Re: Data Processing speed SQL Vs SPARK
For record counts below 50,000, SQL is better, right?

On Fri, Jul 10, 2015 at 12:18 AM, ayan guha <guha.a...@gmail.com> wrote:
> With your load, either should be fine. I would suggest you run a couple of quick prototypes.
> Best
> Ayan
Re: Data Processing speed SQL Vs SPARK
Ayan,

I want to process data ranging from roughly 5 records to 2 lakh (200,000) records in flat files. Is there any scale for deciding which technology is best, SQL or Spark?

On Thu, Jul 9, 2015 at 9:40 AM, ayan guha <guha.a...@gmail.com> wrote:
> It depends on workload. How much data would you want to process?
Data Processing speed SQL Vs SPARK
Hi everyone,

I am new to Spark. I am using SQL Server to handle the data in my application, and I am thinking of moving to Spark now. Is the data processing speed of Spark better than SQL Server?

Thanks,
Vinod
Re: UDF in spark
Thanks Vishnu.

When I restart the service, the UDF is no longer accessible from my query; I need to run the block you mentioned again before I can use the UDF. Is there any way to keep a UDF registered in sqlContext permanently?

Thanks,
Vinod

On Wed, Jul 8, 2015 at 7:16 AM, VISHNU SUBRAMANIAN <johnfedrickena...@gmail.com> wrote:
> Hi,
>
>     sqlContext.udf.register("udfName", functionName _)
>
> Example:
>
>     def square(x: Int): Int = { x * x }
>
> Register the UDF as below:
>
>     sqlContext.udf.register("square", square _)
>
> Thanks,
> Vishnu
Re: UDF in spark
Thank you for the quick response, Vishnu. I have the following doubts too:

1. Is there any way to upload files to HDFS programmatically using C#?
2. Is there any way to automatically load the Scala block of code (for the UDF) when I start the Spark service?

-Vinod

On Wed, Jul 8, 2015 at 7:57 AM, VISHNU SUBRAMANIAN <johnfedrickena...@gmail.com> wrote:
> Hi Vinod,
> Yes, if you want to use a Scala or Python function you need that block of code. Only Hive UDFs are available permanently.
> Thanks,
> Vishnu
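For the Hive UDFs Vishnu mentions, registration goes through SQL rather than Scala. A hedged sketch (the jar path and class name are hypothetical; `CREATE TEMPORARY FUNCTION` requires a HiveContext, and a temporary function still has to be re-created per session, so "permanent" here depends on the function being defined in the Hive metastore):

```sql
ADD JAR /path/to/my-udfs.jar;
CREATE TEMPORARY FUNCTION square AS 'com.example.udf.Square';
SELECT square(runs) FROM scores;
```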
UDF in spark
Hi everyone,

I am new to Spark. May I know how to define and use a user-defined function (UDF) in Spark SQL? I want to use the UDF from SQL queries.

My environment: Windows 8, Spark 1.3.1.

Warm regards,
Vinod