[jira] [Commented] (SPARK-8773) Throw type mismatch in check analysis for expressions with expected input types defined

2015-07-02 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611623#comment-14611623
 ] 

Akhil Thatipamula commented on SPARK-8773:
--

[~rxin] Aren't we checking that already with
|case e: Expression if e.checkInputDataTypes().isFailure|
Am I missing something?
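
For reference, a minimal self-contained sketch of the kind of analysis-time rule SPARK-8773 asks for. The types below are simplified stand-ins for the Catalyst classes, not the actual CheckAnalysis code:

{code}
// Simplified stand-ins; the names mirror Spark's but the definitions here
// are illustrative only.
sealed trait TypeCheckResult { def isFailure: Boolean }
case object TypeCheckSuccess extends TypeCheckResult { val isFailure = false }
case class TypeCheckFailure(message: String) extends TypeCheckResult { val isFailure = true }

trait Expression {
  def prettyString: String
  def checkInputDataTypes(): TypeCheckResult
}

class AnalysisException(msg: String) extends Exception(msg)

object CheckAnalysisSketch {
  // Turn any expression whose expected input types are not satisfied into an
  // analysis-time error instead of letting it fail later at runtime.
  def check(expressions: Seq[Expression]): Unit = expressions.foreach {
    // The guard mirrors the check quoted above.
    case e if e.checkInputDataTypes().isFailure =>
      val message = e.checkInputDataTypes() match {
        case TypeCheckFailure(m) => m
        case TypeCheckSuccess    => "" // unreachable given the guard
      }
      throw new AnalysisException(
        s"cannot resolve '${e.prettyString}' due to data type mismatch: $message")
    case _ => // the expression type-checks; nothing to do
  }
}
{code}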


 Throw type mismatch in check analysis for expressions with expected input 
 types defined
 ---

 Key: SPARK-8773
 URL: https://issues.apache.org/jira/browse/SPARK-8773
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin








[jira] [Commented] (SPARK-8745) Remove GenerateMutableProjection

2015-07-02 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612486#comment-14612486
 ] 

Akhil Thatipamula commented on SPARK-8745:
--

okay for me.

 Remove GenerateMutableProjection
 

 Key: SPARK-8745
 URL: https://issues.apache.org/jira/browse/SPARK-8745
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Reporter: Reynold Xin
Assignee: Davies Liu

 Based on discussion offline with [~marmbrus], we should remove 
 GenerateMutableProjection.






[jira] [Comment Edited] (SPARK-8745) Remove GenerateMutableProjection

2015-07-02 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612486#comment-14612486
 ] 

Akhil Thatipamula edited comment on SPARK-8745 at 7/2/15 8:40 PM:
--

Fine with me.


was (Author: 6133d):
okay for me.

 Remove GenerateMutableProjection
 

 Key: SPARK-8745
 URL: https://issues.apache.org/jira/browse/SPARK-8745
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Reporter: Reynold Xin
Assignee: Davies Liu

 Based on discussion offline with [~marmbrus], we should remove 
 GenerateMutableProjection.






[jira] [Commented] (SPARK-8745) Remove GenerateMutableProjection

2015-06-30 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14609566#comment-14609566
 ] 

Akhil Thatipamula commented on SPARK-8745:
--

[~rxin] I will work on this.

 Remove GenerateMutableProjection
 

 Key: SPARK-8745
 URL: https://issues.apache.org/jira/browse/SPARK-8745
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Reporter: Reynold Xin

 Based on discussion offline with [~marmbrus], we should remove 
 GenerateMutableProjection.






[jira] [Commented] (SPARK-8608) After initializing a DataFrame with random columns and a seed, df.show should return same value

2015-06-26 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602543#comment-14602543
 ] 

Akhil Thatipamula commented on SPARK-8608:
--

[~brkyvz] A more detailed description would be helpful.

 After initializing a DataFrame with random columns and a seed, df.show should 
 return same value
 ---

 Key: SPARK-8608
 URL: https://issues.apache.org/jira/browse/SPARK-8608
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 1.4.0, 1.4.1
Reporter: Burak Yavuz
Priority: Critical








[jira] [Commented] (SPARK-7993) Improve DataFrame.show() output

2015-06-03 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570704#comment-14570704
 ] 

Akhil Thatipamula commented on SPARK-7993:
--

[~rxin] I have come up with two methods:
a) We can get the runtime class of a particular data cell of a row using 
'getClass', check whether it falls under 'scala.collection' (which covers all 
of the 'container data types'), and act accordingly.
b) We can check whether the data type of a given column is primitive 
[StringType, FloatType, IntegerType, ByteType, ShortType, DoubleType, 
LongType, BinaryType, BooleanType, DateType, DecimalType, TimestampType] 
according to the classification given in DataType.scala, and act accordingly.
I have implemented both. Which is the better method?
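
A minimal, self-contained sketch of the two approaches; the helper object and the string-based type check below are illustrative only, not the real DataFrame/DataType API:

{code}
// Sketch only: `dataTypeName` stands in for the column's DataType taken from
// the DataFrame schema (e.g. "StringType", "ArrayType").
object ContainerCheckSketch {

  // (a) Runtime check: inspect the cell's class and see whether it lives
  //     under scala.collection, i.e. it is a "container" value.
  def isContainerByClass(cell: Any): Boolean =
    cell != null && cell.getClass.getName.startsWith("scala.collection")

  // (b) Schema check: treat everything outside the primitive types listed in
  //     DataType.scala as a container/complex type.
  private val primitiveTypeNames = Set(
    "StringType", "FloatType", "IntegerType", "ByteType", "ShortType",
    "DoubleType", "LongType", "BinaryType", "BooleanType", "DateType",
    "DecimalType", "TimestampType")

  def isContainerBySchema(dataTypeName: String): Boolean =
    !primitiveTypeNames.contains(dataTypeName)
}
{code}
One practical difference: (b) only needs the schema, so it also behaves sensibly when a cell happens to be null.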

 Improve DataFrame.show() output
 ---

 Key: SPARK-7993
 URL: https://issues.apache.org/jira/browse/SPARK-7993
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Priority: Blocker
  Labels: starter

 1. Each column should be at least 3 characters wide. Right now, if the 
 widest value is 1 character, the column is just 1 character wide, which looks 
 ugly. Example below:
 2. If a DataFrame has more than N rows (N = 20 by default for show), we 
 should display a message like "only showing top 20 rows" at the end.
 {code}
 +--+--+-+
 | a| b|c|
 +--+--+-+
 | 1| 2|3|
 | 1| 2|1|
 | 1| 2|3|
 | 3| 6|3|
 | 1| 2|3|
 | 5|10|1|
 | 1| 2|3|
 | 7|14|3|
 | 1| 2|3|
 | 9|18|1|
 | 1| 2|3|
 |11|22|3|
 | 1| 2|3|
 |13|26|1|
 | 1| 2|3|
 |15|30|3|
 | 1| 2|3|
 |17|34|1|
 | 1| 2|3|
 |19|38|3|
 +--+--+-+
 only showing top 20 rows   <-- add this at the end
 {code}
 3. For array values, instead of printing ArrayBuffer, we should just print 
 square brackets:
 {code}
 +------------------+------------------+-----------------+
 |       a_freqItems|       b_freqItems|      c_freqItems|
 +------------------+------------------+-----------------+
 |ArrayBuffer(11, 1)|ArrayBuffer(2, 22)|ArrayBuffer(1, 3)|
 +------------------+------------------+-----------------+
 {code}
 should be
 {code}
 +-----------+-----------+-----------+
 |a_freqItems|b_freqItems|c_freqItems|
 +-----------+-----------+-----------+
 |    [11, 1]|    [2, 22]|     [1, 3]|
 +-----------+-----------+-----------+
 {code}
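
A rough, standalone sketch of the padding and footer behaviour described in points 1 and 2 above; the helper below is hypothetical, not DataFrame.showString itself:

{code}
object ShowSketch {
  // Render rows as an ASCII table: every column at least 3 characters wide,
  // plus a trailing note when the output was truncated to `numRows`.
  def showString(header: Seq[String], rows: Seq[Seq[String]],
                 numRows: Int = 20): String = {
    val shown = rows.take(numRows)
    val table = header +: shown
    // Point 1: pad each column to max(3, widest cell in that column).
    val widths = header.indices.map(i => math.max(3, table.map(_(i).length).max))
    def line(row: Seq[String]): String =
      row.zip(widths).map { case (cell, w) => cell.reverse.padTo(w, ' ').reverse }
        .mkString("|", "|", "|")
    val sep = widths.map("-" * _).mkString("+", "+", "+")
    val body = (Seq(sep, line(header), sep) ++ shown.map(line) :+ sep).mkString("\n")
    // Point 2: append a note when there were more rows than we displayed.
    if (rows.length > numRows) body + s"\nonly showing top $numRows rows" else body
  }
}
{code}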






[jira] [Commented] (SPARK-7993) Improve DataFrame.show() output

2015-06-02 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568717#comment-14568717
 ] 

Akhil Thatipamula commented on SPARK-7993:
--

I am planning to check whether the data type of a given column is primitive, 
and if it turns out to be non-primitive, to modify the string value 
('cell.toString') before it is printed.

Is that a legitimate approach?
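
For illustration, a tiny sketch of what 'modifying the value' could look like; the helper is hypothetical, not the actual DataFrame.scala change:

{code}
object CellFormatSketch {
  // Primitive values keep their toString; container values (Seq, Array, ...)
  // are rendered as [elem, elem, ...] instead.
  def formatCell(cell: Any): String = cell match {
    case null          => "null"
    case seq: Seq[_]   => seq.mkString("[", ", ", "]")
    case arr: Array[_] => arr.mkString("[", ", ", "]")
    case other         => other.toString
  }
}

// e.g. CellFormatSketch.formatCell(Seq(11, 1)) == "[11, 1]"
{code}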

 Improve DataFrame.show() output
 ---

 Key: SPARK-7993
 URL: https://issues.apache.org/jira/browse/SPARK-7993
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Priority: Blocker
  Labels: starter

 1. Each column should be at least 3 characters wide. Right now, if the 
 widest value is 1 character, the column is just 1 character wide, which looks 
 ugly. Example below:
 2. If a DataFrame has more than N rows (N = 20 by default for show), we 
 should display a message like "only showing top 20 rows" at the end.
 {code}
 +--+--+-+
 | a| b|c|
 +--+--+-+
 | 1| 2|3|
 | 1| 2|1|
 | 1| 2|3|
 | 3| 6|3|
 | 1| 2|3|
 | 5|10|1|
 | 1| 2|3|
 | 7|14|3|
 | 1| 2|3|
 | 9|18|1|
 | 1| 2|3|
 |11|22|3|
 | 1| 2|3|
 |13|26|1|
 | 1| 2|3|
 |15|30|3|
 | 1| 2|3|
 |17|34|1|
 | 1| 2|3|
 |19|38|3|
 +--+--+-+
 only showing top 20 rows   <-- add this at the end
 {code}
 3. For array values, instead of printing ArrayBuffer, we should just print 
 square brackets:
 {code}
 +------------------+------------------+-----------------+
 |       a_freqItems|       b_freqItems|      c_freqItems|
 +------------------+------------------+-----------------+
 |ArrayBuffer(11, 1)|ArrayBuffer(2, 22)|ArrayBuffer(1, 3)|
 +------------------+------------------+-----------------+
 {code}
 should be
 {code}
 +-----------+-----------+-----------+
 |a_freqItems|b_freqItems|c_freqItems|
 +-----------+-----------+-----------+
 |    [11, 1]|    [2, 22]|     [1, 3]|
 +-----------+-----------+-----------+
 {code}






[jira] [Commented] (SPARK-7993) Improve DataFrame.show() output

2015-06-01 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14567135#comment-14567135
 ] 

Akhil Thatipamula commented on SPARK-7993:
--

[~rxin] Does the 3rd modification affect 'List' as well?
For instance,
+--------------------+
|             modules|
+--------------------+
|List(mllib, sql, ...|
+--------------------+
should it be
+--------------------+
|             modules|
+--------------------+
|    [mllib, sql, ...|
+--------------------+
?

 Improve DataFrame.show() output
 ---

 Key: SPARK-7993
 URL: https://issues.apache.org/jira/browse/SPARK-7993
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Priority: Blocker
  Labels: starter

 1. Each column should be at least 3 characters wide. Right now, if the 
 widest value is 1 character, the column is just 1 character wide, which looks 
 ugly. Example below:
 2. If a DataFrame has more than N rows (N = 20 by default for show), we 
 should display a message like "only showing top 20 rows" at the end.
 {code}
 +--+--+-+
 | a| b|c|
 +--+--+-+
 | 1| 2|3|
 | 1| 2|1|
 | 1| 2|3|
 | 3| 6|3|
 | 1| 2|3|
 | 5|10|1|
 | 1| 2|3|
 | 7|14|3|
 | 1| 2|3|
 | 9|18|1|
 | 1| 2|3|
 |11|22|3|
 | 1| 2|3|
 |13|26|1|
 | 1| 2|3|
 |15|30|3|
 | 1| 2|3|
 |17|34|1|
 | 1| 2|3|
 |19|38|3|
 +--+--+-+
 only showing top 20 rows   <-- add this at the end
 {code}
 3. For array values, instead of printing ArrayBuffer, we should just print 
 square brackets:
 {code}
 +------------------+------------------+-----------------+
 |       a_freqItems|       b_freqItems|      c_freqItems|
 +------------------+------------------+-----------------+
 |ArrayBuffer(11, 1)|ArrayBuffer(2, 22)|ArrayBuffer(1, 3)|
 +------------------+------------------+-----------------+
 {code}
 should be
 {code}
 +-----------+-----------+-----------+
 |a_freqItems|b_freqItems|c_freqItems|
 +-----------+-----------+-----------+
 |    [11, 1]|    [2, 22]|     [1, 3]|
 +-----------+-----------+-----------+
 {code}






[jira] [Commented] (SPARK-7993) Improve DataFrame.show() output

2015-06-01 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14567017#comment-14567017
 ] 

Akhil Thatipamula commented on SPARK-7993:
--

[~rxin] I will work on this.

 Improve DataFrame.show() output
 ---

 Key: SPARK-7993
 URL: https://issues.apache.org/jira/browse/SPARK-7993
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Priority: Blocker
  Labels: starter

 1. Each column should be at least 3 characters wide. Right now, if the 
 widest value is 1 character, the column is just 1 character wide, which looks 
 ugly. Example below:
 2. If a DataFrame has more than N rows (N = 20 by default for show), we 
 should display a message like "only showing top 20 rows" at the end.
 {code}
 +--+--+-+
 | a| b|c|
 +--+--+-+
 | 1| 2|3|
 | 1| 2|1|
 | 1| 2|3|
 | 3| 6|3|
 | 1| 2|3|
 | 5|10|1|
 | 1| 2|3|
 | 7|14|3|
 | 1| 2|3|
 | 9|18|1|
 | 1| 2|3|
 |11|22|3|
 | 1| 2|3|
 |13|26|1|
 | 1| 2|3|
 |15|30|3|
 | 1| 2|3|
 |17|34|1|
 | 1| 2|3|
 |19|38|3|
 +--+--+-+
 only showing top 20 rows   <-- add this at the end
 {code}
 3. For array values, instead of printing ArrayBuffer, we should just print 
 square brackets:
 {code}
 +------------------+------------------+-----------------+
 |       a_freqItems|       b_freqItems|      c_freqItems|
 +------------------+------------------+-----------------+
 |ArrayBuffer(11, 1)|ArrayBuffer(2, 22)|ArrayBuffer(1, 3)|
 +------------------+------------------+-----------------+
 {code}
 should be
 {code}
 +-----------+-----------+-----------+
 |a_freqItems|b_freqItems|c_freqItems|
 +-----------+-----------+-----------+
 |    [11, 1]|    [2, 22]|     [1, 3]|
 +-----------+-----------+-----------+
 {code}






[jira] [Comment Edited] (SPARK-7993) Improve DataFrame.show() output

2015-06-01 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14567135#comment-14567135
 ] 

Akhil Thatipamula edited comment on SPARK-7993 at 6/1/15 10:25 AM:
---

[~rxin] Does the 3rd modification affect 'List' as well?
For instance,
+--------------------+
|             modules|
+--------------------+
|List(mllib, sql, ...|
+--------------------+
should it be the following?
+--------------------+
|             modules|
+--------------------+
|    [mllib, sql, ...|
+--------------------+



was (Author: 6133d):
[~rxin] Does the 3rd modification effect 'List' as well.
For instance,
+--------------------+
|             modules|
+--------------------+
|List(mllib, sql, ...|
+--------------------+
should it be
+--------------------+
|             modules|
+--------------------+
|    [mllib, sql, ...|
+--------------------+
?

 Improve DataFrame.show() output
 ---

 Key: SPARK-7993
 URL: https://issues.apache.org/jira/browse/SPARK-7993
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Priority: Blocker
  Labels: starter

 1. Each column should be at least 3 characters wide. Right now, if the 
 widest value is 1 character, the column is just 1 character wide, which looks 
 ugly. Example below:
 2. If a DataFrame has more than N rows (N = 20 by default for show), we 
 should display a message like "only showing top 20 rows" at the end.
 {code}
 +--+--+-+
 | a| b|c|
 +--+--+-+
 | 1| 2|3|
 | 1| 2|1|
 | 1| 2|3|
 | 3| 6|3|
 | 1| 2|3|
 | 5|10|1|
 | 1| 2|3|
 | 7|14|3|
 | 1| 2|3|
 | 9|18|1|
 | 1| 2|3|
 |11|22|3|
 | 1| 2|3|
 |13|26|1|
 | 1| 2|3|
 |15|30|3|
 | 1| 2|3|
 |17|34|1|
 | 1| 2|3|
 |19|38|3|
 +--+--+-+
 only showing top 20 rows   <-- add this at the end
 {code}
 3. For array values, instead of printing ArrayBuffer, we should just print 
 square brackets:
 {code}
 +------------------+------------------+-----------------+
 |       a_freqItems|       b_freqItems|      c_freqItems|
 +------------------+------------------+-----------------+
 |ArrayBuffer(11, 1)|ArrayBuffer(2, 22)|ArrayBuffer(1, 3)|
 +------------------+------------------+-----------------+
 {code}
 should be
 {code}
 +-----------+-----------+-----------+
 |a_freqItems|b_freqItems|c_freqItems|
 +-----------+-----------+-----------+
 |    [11, 1]|    [2, 22]|     [1, 3]|
 +-----------+-----------+-----------+
 {code}






[jira] [Comment Edited] (SPARK-7993) Improve DataFrame.show() output

2015-06-01 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14567135#comment-14567135
 ] 

Akhil Thatipamula edited comment on SPARK-7993 at 6/1/15 10:26 AM:
---

[~rxin] Does the 3rd modification affect 'List' as well?
For instance,
|List(mllib, sql, ...|
should it be the following?
| [mllib, sql, ...|



was (Author: 6133d):
[~rxin] Does the 3rd modification effect 'List' as well.
For instance,
+--------------------+
|             modules|
+--------------------+
|List(mllib, sql, ...|
+--------------------+
should it be?
+--------------------+
|             modules|
+--------------------+
|    [mllib, sql, ...|
+--------------------+


 Improve DataFrame.show() output
 ---

 Key: SPARK-7993
 URL: https://issues.apache.org/jira/browse/SPARK-7993
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Priority: Blocker
  Labels: starter

 1. Each column should be at least 3 characters wide. Right now, if the 
 widest value is 1 character, the column is just 1 character wide, which looks 
 ugly. Example below:
 2. If a DataFrame has more than N rows (N = 20 by default for show), we 
 should display a message like "only showing top 20 rows" at the end.
 {code}
 +--+--+-+
 | a| b|c|
 +--+--+-+
 | 1| 2|3|
 | 1| 2|1|
 | 1| 2|3|
 | 3| 6|3|
 | 1| 2|3|
 | 5|10|1|
 | 1| 2|3|
 | 7|14|3|
 | 1| 2|3|
 | 9|18|1|
 | 1| 2|3|
 |11|22|3|
 | 1| 2|3|
 |13|26|1|
 | 1| 2|3|
 |15|30|3|
 | 1| 2|3|
 |17|34|1|
 | 1| 2|3|
 |19|38|3|
 +--+--+-+
 only showing top 20 rows   <-- add this at the end
 {code}
 3. For array values, instead of printing ArrayBuffer, we should just print 
 square brackets:
 {code}
 +------------------+------------------+-----------------+
 |       a_freqItems|       b_freqItems|      c_freqItems|
 +------------------+------------------+-----------------+
 |ArrayBuffer(11, 1)|ArrayBuffer(2, 22)|ArrayBuffer(1, 3)|
 +------------------+------------------+-----------------+
 {code}
 should be
 {code}
 +-----------+-----------+-----------+
 |a_freqItems|b_freqItems|c_freqItems|
 +-----------+-----------+-----------+
 |    [11, 1]|    [2, 22]|     [1, 3]|
 +-----------+-----------+-----------+
 {code}






[jira] [Commented] (SPARK-7993) Improve DataFrame.show() output

2015-06-01 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14567040#comment-14567040
 ] 

Akhil Thatipamula commented on SPARK-7993:
--

Thanks for mentioning it, I will take care of that.

 Improve DataFrame.show() output
 ---

 Key: SPARK-7993
 URL: https://issues.apache.org/jira/browse/SPARK-7993
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Reporter: Reynold Xin
Priority: Blocker
  Labels: starter

 1. Each column should be at least 3 characters wide. Right now, if the 
 widest value is 1 character, the column is just 1 character wide, which looks 
 ugly. Example below:
 2. If a DataFrame has more than N rows (N = 20 by default for show), we 
 should display a message like "only showing top 20 rows" at the end.
 {code}
 +--+--+-+
 | a| b|c|
 +--+--+-+
 | 1| 2|3|
 | 1| 2|1|
 | 1| 2|3|
 | 3| 6|3|
 | 1| 2|3|
 | 5|10|1|
 | 1| 2|3|
 | 7|14|3|
 | 1| 2|3|
 | 9|18|1|
 | 1| 2|3|
 |11|22|3|
 | 1| 2|3|
 |13|26|1|
 | 1| 2|3|
 |15|30|3|
 | 1| 2|3|
 |17|34|1|
 | 1| 2|3|
 |19|38|3|
 +--+--+-+
 only showing top 20 rows   <-- add this at the end
 {code}
 3. For array values, instead of printing ArrayBuffer, we should just print 
 square brackets:
 {code}
 +------------------+------------------+-----------------+
 |       a_freqItems|       b_freqItems|      c_freqItems|
 +------------------+------------------+-----------------+
 |ArrayBuffer(11, 1)|ArrayBuffer(2, 22)|ArrayBuffer(1, 3)|
 +------------------+------------------+-----------------+
 {code}
 should be
 {code}
 +-----------+-----------+-----------+
 |a_freqItems|b_freqItems|c_freqItems|
 +-----------+-----------+-----------+
 |    [11, 1]|    [2, 22]|     [1, 3]|
 +-----------+-----------+-----------+
 {code}






[jira] [Commented] (SPARK-7012) Add support for NOT NULL modifier for column definitions on DDLParser

2015-05-26 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558949#comment-14558949
 ] 

Akhil Thatipamula commented on SPARK-7012:
--

[~smolav] I worked on this for a while. I think that to add such a modifier we 
first need the parser to handle CREATE TABLE. We can run such a query using a 
HiveContext, but HiveContext uses the parser provided by hive-ql, so we cannot 
easily extend it. SQLContext, on the other hand, uses 'SqlParser.scala', which 
does not have any such method for 'CREATE TABLE'.
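
For what it's worth, a standalone sketch of how an optional NOT NULL modifier could be parsed with Scala's parser combinators; the grammar and names here are hypothetical, not the actual DDLParser code:

{code}
import scala.util.parsing.combinator.JavaTokenParsers

// Hypothetical column-definition grammar with an optional NOT NULL modifier.
case class ColumnDef(name: String, dataType: String, nullable: Boolean)

object ColumnDefParserSketch extends JavaTokenParsers {
  // e.g. "field INTEGER NOT NULL" -> ColumnDef("field", "INTEGER", nullable = false)
  def columnDef: Parser[ColumnDef] =
    ident ~ ident ~ opt("NOT" ~ "NULL") ^^ {
      case name ~ dataType ~ notNull =>
        ColumnDef(name, dataType, nullable = notNull.isEmpty)
    }

  def parseColumn(input: String): ColumnDef = parseAll(columnDef, input).get
}
{code}
On the Spark side, the NOT NULL information would presumably end up in the nullable flag of the corresponding StructField.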

 Add support for NOT NULL modifier for column definitions on DDLParser
 -

 Key: SPARK-7012
 URL: https://issues.apache.org/jira/browse/SPARK-7012
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.3.0
Reporter: Santiago M. Mola
Priority: Minor
  Labels: easyfix

 Add support for NOT NULL modifier for column definitions on DDLParser. This 
 would add support for the following syntax:
 CREATE TEMPORARY TABLE (field INTEGER NOT NULL) ...






[jira] [Comment Edited] (SPARK-7012) Add support for NOT NULL modifier for column definitions on DDLParser

2015-05-26 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558949#comment-14558949
 ] 

Akhil Thatipamula edited comment on SPARK-7012 at 5/26/15 10:17 AM:


[~smolav] I worked on this for a while. I think that to add such a modifier we 
first need the parser to handle CREATE TABLE. We can run such a query using a 
HiveContext, but HiveContext uses the parser provided by hive-ql, so we cannot 
easily extend it. SQLContext, on the other hand, uses 'SqlParser.scala', which 
does not have any such method for 'CREATE TABLE'.


was (Author: 6133d):
[~smolav] I worked over this for a while, I think to add such a modifier we 
first need a method(CREATE TABLE). We can run such query using a HiveContext, 
but HiveContext uses parser provided by hive-ql, so we cannot improvise it. 
Where as SQLContext uses 'SqlParser.scala', which don't have any method such 
method as 'CREATE TABLE'. 

 Add support for NOT NULL modifier for column definitions on DDLParser
 -

 Key: SPARK-7012
 URL: https://issues.apache.org/jira/browse/SPARK-7012
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.3.0
Reporter: Santiago M. Mola
Priority: Minor
  Labels: easyfix

 Add support for NOT NULL modifier for column definitions on DDLParser. This 
 would add support for the following syntax:
 CREATE TEMPORARY TABLE (field INTEGER NOT NULL) ...






[jira] [Commented] (SPARK-7327) DataFrame show() method doesn't like empty dataframes

2015-05-14 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543629#comment-14543629
 ] 

Akhil Thatipamula commented on SPARK-7327:
--

@Oliver I have checked, but I haven't come across the issue; everything seems 
to be working well. Can you post the exact code you used?

 DataFrame show() method doesn't like empty dataframes
 -

 Key: SPARK-7327
 URL: https://issues.apache.org/jira/browse/SPARK-7327
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.1
Reporter: Olivier Girardot
Priority: Minor

 For an empty DataFrame (for example, after a filter), any call to show() ends 
 up with:
 {code}
 java.util.MissingFormatWidthException: -0s
   at java.util.Formatter$FormatSpecifier.checkGeneral(Formatter.java:2906)
   at java.util.Formatter$FormatSpecifier.init(Formatter.java:2680)
   at java.util.Formatter.parse(Formatter.java:2528)
   at java.util.Formatter.format(Formatter.java:2469)
   at java.util.Formatter.format(Formatter.java:2423)
   at java.lang.String.format(String.java:2790)
   at 
 org.apache.spark.sql.DataFrame$$anonfun$showString$2$$anonfun$apply$4.apply(DataFrame.scala:200)
   at 
 org.apache.spark.sql.DataFrame$$anonfun$showString$2$$anonfun$apply$4.apply(DataFrame.scala:199)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at 
 org.apache.spark.sql.DataFrame$$anonfun$showString$2.apply(DataFrame.scala:199)
   at 
 org.apache.spark.sql.DataFrame$$anonfun$showString$2.apply(DataFrame.scala:198)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:73)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at org.apache.spark.sql.DataFrame.showString(DataFrame.scala:198)
   at org.apache.spark.sql.DataFrame.show(DataFrame.scala:314)
   at org.apache.spark.sql.DataFrame.show(DataFrame.scala:320)
 {code}
 If no-one takes it by next Friday, I'll fix it; the problem seems to come 
 from the colWidths method:
 {code}
// Compute the width of each column
 val colWidths = Array.fill(numCols)(0)
 for (row <- rows) {
   for ((cell, i) <- row.zipWithIndex) {
 colWidths(i) = math.max(colWidths(i), cell.length)
   }
 }
 {code}
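
The stack trace above comes from a zero column width reaching String.format (producing the "%-0s" specifier). A small standalone sketch of that failure mode and one possible guard; illustrative only, not the actual DataFrame.scala fix:

{code}
object EmptyShowSketch {
  def pad(cell: String, width: Int): String =
    // With width == 0 this builds "%-0s", which is exactly the
    // java.util.MissingFormatWidthException reported above.
    String.format(s"%-${width}s", cell)

  def safePad(cell: String, width: Int): String =
    // Guard: never let a column width drop below 1 (or 3, per SPARK-7993),
    // e.g. by seeding colWidths with the header lengths instead of 0.
    String.format(s"%-${math.max(width, 1)}s", cell)
}

// EmptyShowSketch.pad("a", 0)     // throws MissingFormatWidthException: -0s
// EmptyShowSketch.safePad("a", 0) // "a"
{code}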






[jira] [Comment Edited] (SPARK-7012) Add support for NOT NULL modifier for column definitions on DDLParser

2015-05-13 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543208#comment-14543208
 ] 

Akhil Thatipamula edited comment on SPARK-7012 at 5/14/15 5:18 AM:
---

Can anyone elaborate on this issue?


was (Author: 6133d):
Can any one eloberate on this issue??

 Add support for NOT NULL modifier for column definitions on DDLParser
 -

 Key: SPARK-7012
 URL: https://issues.apache.org/jira/browse/SPARK-7012
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.3.0
Reporter: Santiago M. Mola
Priority: Minor
  Labels: easyfix

 Add support for NOT NULL modifier for column definitions on DDLParser. This 
 would add support for the following syntax:
 CREATE TEMPORARY TABLE (field INTEGER NOT NULL) ...






[jira] [Commented] (SPARK-7012) Add support for NOT NULL modifier for column definitions on DDLParser

2015-05-13 Thread Akhil Thatipamula (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543208#comment-14543208
 ] 

Akhil Thatipamula commented on SPARK-7012:
--

Can anyone elaborate on this issue?

 Add support for NOT NULL modifier for column definitions on DDLParser
 -

 Key: SPARK-7012
 URL: https://issues.apache.org/jira/browse/SPARK-7012
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.3.0
Reporter: Santiago M. Mola
Priority: Minor
  Labels: easyfix

 Add support for NOT NULL modifier for column definitions on DDLParser. This 
 would add support for the following syntax:
 CREATE TEMPORARY TABLE (field INTEGER NOT NULL) ...


