[jira] [Resolved] (SPARK-20708) Make `addExclusionRules` up-to-date

2017-05-31 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-20708. - Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 2.3.0 > M

[jira] [Commented] (SPARK-20708) Make `addExclusionRules` up-to-date

2017-05-31 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032475#comment-16032475 ] Burak Yavuz commented on SPARK-20708: - Resolved by https://github.com/apache/spark/pull/17947 > M

Re: Structured Streaming from Parquet

2017-05-25 Thread Burak Yavuz
Hi Paul, From what you're describing, it seems that stream1 is possibly generating tons of small files and stream2 is OOMing because it tries to maintain an in-memory list of files. Some notes/questions: 1. Parquet files are splittable, therefore having large parquet files shouldn't be a

[phpMyAdmin Git] [phpmyadmin/localized_docs] 3a556c: Translated using Weblate (Turkish)

2017-05-24 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 3a556caa9429ae024a9ef84fb7abf147adf146f3 https://github.com/phpmyadmin/localized_docs/commit/3a556caa9429ae024a9ef84fb7abf147adf146f3 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

Re: couple naive questions on Spark Structured Streaming

2017-05-22 Thread Burak Yavuz
Hi Kant, > > > 1. Can we use Spark Structured Streaming for stateless transformations > just like we would do with DStreams or Spark Structured Streaming is only > meant for stateful computations? > Of course you can do stateless transformations. Any map, filter, select, type of transformation

[phpMyAdmin Git] [phpmyadmin/localized_docs] d74c4b: Translated using Weblate (Turkish)

2017-05-18 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: d74c4b4691e26179f1de914d76b81d25c93ea214 https://github.com/phpmyadmin/localized_docs/commit/d74c4b4691e26179f1de914d76b81d25c93ea214 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] 37b73a: Translated using Weblate (Turkish)

2017-05-17 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 37b73a2457778862b6539cdd8fa52511aa9132db https://github.com/phpmyadmin/phpmyadmin/commit/37b73a2457778862b6539cdd8fa52511aa9132db Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2017

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] 445677: Translated using Weblate (Turkish)

2017-05-17 Thread Burak Yavuz
Branch: refs/heads/QA_4_7 Home: https://github.com/phpmyadmin/phpmyadmin Commit: 445677ecd9de799ab4d0e3b695ccf6a72f1cfe3d https://github.com/phpmyadmin/phpmyadmin/commit/445677ecd9de799ab4d0e3b695ccf6a72f1cfe3d Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2017

[jira] [Resolved] (SPARK-20140) Remove hardcoded kinesis retry wait and max retries

2017-05-16 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-20140. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 > Remove hardco

[jira] [Assigned] (SPARK-20140) Remove hardcoded kinesis retry wait and max retries

2017-05-16 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz reassigned SPARK-20140: --- Assignee: Yash Sharma > Remove hardcoded kinesis retry wait and max retr

[jira] [Commented] (SPARK-20140) Remove hardcoded kinesis retry wait and max retries

2017-05-16 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013191#comment-16013191 ] Burak Yavuz commented on SPARK-20140: - resolved by https://github.com/apache/spark/pull/17467

[jira] [Created] (SPARK-20775) from_json should also have an API where the schema is specified with a string

2017-05-16 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-20775: --- Summary: from_json should also have an API where the schema is specified with a string Key: SPARK-20775 URL: https://issues.apache.org/jira/browse/SPARK-20775 Project

[phpMyAdmin Git] [phpmyadmin/localized_docs] 252c86: Translated using Weblate (Turkish)

2017-05-10 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 252c862dd57bb83b08a72a76939986154bb43350 https://github.com/phpmyadmin/localized_docs/commit/252c862dd57bb83b08a72a76939986154bb43350 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

Re: Why does dataset.union fails but dataset.rdd.union execute correctly?

2017-05-08 Thread Burak Yavuz
tly the same schema, but > one side support null and the other doesn't, this exception (in union > dataset) will be thrown? > > > > 2017-05-08 16:41 GMT-03:00 Burak Yavuz <brk...@gmail.com>: > >> I also want to add that generally these may be caused by the >> `nu

Re: Why does dataset.union fails but dataset.rdd.union execute correctly?

2017-05-08 Thread Burak Yavuz
I also want to add that generally these may be caused by the `nullability` field in the schema. On Mon, May 8, 2017 at 12:25 PM, Shixiong(Ryan) Zhu wrote: > This is because RDD.union doesn't check the schema, so you won't see the > problem unless you run RDD and hit

[jira] [Commented] (SPARK-20571) Flaky SparkR StructuredStreaming tests

2017-05-05 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15998680#comment-15998680 ] Burak Yavuz commented on SPARK-20571: - Thanks! > Flaky SparkR StructuredStreaming te

[jira] [Resolved] (SPARK-20441) Within the same streaming query, one StreamingRelation should only be transformed to one StreamingExecutionRelation

2017-05-03 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-20441. - Resolution: Fixed Resolved with https://github.com/apache/spark/pull/17735 > Within the s

[jira] [Closed] (SPARK-20432) Unioning two identical Streaming DataFrames fails during attribute resolution

2017-05-03 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz closed SPARK-20432. --- Resolution: Duplicate > Unioning two identical Streaming DataFrames fails during attrib

[jira] [Commented] (SPARK-20571) Flaky SparkR StructuredStreaming tests

2017-05-02 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994185#comment-15994185 ] Burak Yavuz commented on SPARK-20571: - cc [~felixcheung] > Flaky SparkR StructuredStreaming te

[jira] [Created] (SPARK-20571) Flaky SparkR StructuredStreaming tests

2017-05-02 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-20571: --- Summary: Flaky SparkR StructuredStreaming tests Key: SPARK-20571 URL: https://issues.apache.org/jira/browse/SPARK-20571 Project: Spark Issue Type: Test

[jira] [Created] (SPARK-20549) java.io.CharConversionException: Invalid UTF-32 in JsonToStructs

2017-05-01 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-20549: --- Summary: java.io.CharConversionException: Invalid UTF-32 in JsonToStructs Key: SPARK-20549 URL: https://issues.apache.org/jira/browse/SPARK-20549 Project: Spark

[jira] [Resolved] (SPARK-20496) KafkaWriter Uses Unanalyzed Logical Plan

2017-04-28 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-20496. - Resolution: Fixed Fix Version/s: 2.2.0 2.1.2 Resolved with https

[jira] [Assigned] (SPARK-20496) KafkaWriter Uses Unanalyzed Logical Plan

2017-04-28 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz reassigned SPARK-20496: --- Assignee: Bill Chambers > KafkaWriter Uses Unanalyzed Logical P

[jira] [Created] (SPARK-20432) Unioning two identical Streaming DataFrames fails during attribute resolution

2017-04-21 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-20432: --- Summary: Unioning two identical Streaming DataFrames fails during attribute resolution Key: SPARK-20432 URL: https://issues.apache.org/jira/browse/SPARK-20432 Project

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] 39fc41: Translated using Weblate (Turkish)

2017-04-12 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 39fc41d15c6b2bfd566a7944d1a762b98f443d19 https://github.com/phpmyadmin/phpmyadmin/commit/39fc41d15c6b2bfd566a7944d1a762b98f443d19 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2017

[jira] [Created] (SPARK-20301) Flakiness in StreamingAggregationSuite

2017-04-11 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-20301: --- Summary: Flakiness in StreamingAggregationSuite Key: SPARK-20301 URL: https://issues.apache.org/jira/browse/SPARK-20301 Project: Spark Issue Type: Test

[jira] [Updated] (SPARK-20301) Flakiness in StreamingAggregationSuite

2017-04-11 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-20301: Labels: flaky-test (was: ) > Flakiness in StreamingAggregationSu

[phpMyAdmin Git] [phpmyadmin/localized_docs] b339ec: Translated using Weblate (Turkish)

2017-04-08 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: b339ecdf2d4b08ba0946d8a58f23f805f8f031ef https://github.com/phpmyadmin/localized_docs/commit/b339ecdf2d4b08ba0946d8a58f23f805f8f031ef Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[jira] [Created] (SPARK-20230) FetchFailedExceptions should invalidate file caches in MapOutputTracker even if newer stages are launched

2017-04-05 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-20230: --- Summary: FetchFailedExceptions should invalidate file caches in MapOutputTracker even if newer stages are launched Key: SPARK-20230 URL: https://issues.apache.org/jira/browse/SPARK

[phpMyAdmin Git] [phpmyadmin/localized_docs] 27e1d7: Translated using Weblate (Turkish)

2017-04-03 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 27e1d79fc1be067ef3ab402398eaba7b1fdc96f5 https://github.com/phpmyadmin/localized_docs/commit/27e1d79fc1be067ef3ab402398eaba7b1fdc96f5 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[phpMyAdmin Git] [phpmyadmin/localized_docs] e00757: Translated using Weblate (Turkish)

2017-04-02 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: e0075726ad73bae46849758407ed6c1aacddbe51 https://github.com/phpmyadmin/localized_docs/commit/e0075726ad73bae46849758407ed6c1aacddbe51 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[phpMyAdmin Git] [phpmyadmin/localized_docs] 9e9bd8: Translated using Weblate (Turkish)

2017-03-31 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 9e9bd81d8d9a69034f5cbb5af9ee22955f3b7d3d https://github.com/phpmyadmin/localized_docs/commit/9e9bd81d8d9a69034f5cbb5af9ee22955f3b7d3d Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[jira] [Resolved] (SPARK-19911) Add builder interface for Kinesis DStreams

2017-03-24 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-19911. - Resolution: Fixed Assignee: Adam Budde Fix Version/s: 2.2.0 Target

Re: Spark 2.0.2 Dataset union() slowness vs RDD union?

2017-03-16 Thread Burak Yavuz
Hi Everett, IIRC we added unionAll in Spark 2.0 which is the same implementation as rdd union. The union in DataFrames with Spark 2.0 does deduplication, and that's why you should be seeing the slowdown. Best, Burak On Thu, Mar 16, 2017 at 4:14 PM, Everett Anderson

[phpMyAdmin Git] [phpmyadmin/localized_docs] ea2f5f: Translated using Weblate (Turkish)

2017-03-13 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: ea2f5f4500513e0856a4cb56f5954f46b01fb230 https://github.com/phpmyadmin/localized_docs/commit/ea2f5f4500513e0856a4cb56f5954f46b01fb230 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[jira] [Created] (SPARK-19886) reportDataLoss cause != null check is wrong for Structured Streaming KafkaSource

2017-03-09 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-19886: --- Summary: reportDataLoss cause != null check is wrong for Structured Streaming KafkaSource Key: SPARK-19886 URL: https://issues.apache.org/jira/browse/SPARK-19886

[jira] [Resolved] (SPARK-19813) maxFilesPerTrigger combo latestFirst may miss old files in combination with maxFileAge in FileStreamSource

2017-03-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-19813. - Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 > maxFilesPerTrig

[jira] [Resolved] (SPARK-19304) Kinesis checkpoint recovery is 10x slow

2017-03-06 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-19304. - Resolution: Fixed Fix Version/s: 2.2.0 Target Version/s: 2.2.0 Resolved

[jira] [Assigned] (SPARK-19304) Kinesis checkpoint recovery is 10x slow

2017-03-06 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz reassigned SPARK-19304: --- Assignee: Gaurav Shah > Kinesis checkpoint recovery is 10x s

[jira] [Resolved] (SPARK-19595) from_json produces only a single row when input is a json array

2017-03-05 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-19595. - Resolution: Fixed Fix Version/s: 2.2.0 Resolved by https://github.com/apache/spark/pull

[jira] [Assigned] (SPARK-19595) from_json produces only a single row when input is a json array

2017-03-05 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz reassigned SPARK-19595: --- Assignee: Hyukjin Kwon > from_json produces only a single row when input is a json ar

[jira] [Created] (SPARK-19813) maxFilesPerTrigger combo latestFirst may miss old files in combination with maxFileAge in FileStreamSource

2017-03-03 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-19813: --- Summary: maxFilesPerTrigger combo latestFirst may miss old files in combination with maxFileAge in FileStreamSource Key: SPARK-19813 URL: https://issues.apache.org/jira/browse

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] b17a0f: Translated using Weblate (Turkish)

2017-03-01 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: b17a0f91cc23bf8800c4aa715d5a7c9a050de41a https://github.com/phpmyadmin/phpmyadmin/commit/b17a0f91cc23bf8800c4aa715d5a7c9a050de41a Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2017

[jira] [Created] (SPARK-19774) StreamExecution should call stop() on sources when a stream fails

2017-02-28 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-19774: --- Summary: StreamExecution should call stop() on sources when a stream fails Key: SPARK-19774 URL: https://issues.apache.org/jira/browse/SPARK-19774 Project: Spark

[phpMyAdmin Git] [phpmyadmin/localized_docs] bb684c: Translated using Weblate (Turkish)

2017-02-23 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: bb684c9e7108bb04ac7421b6126f1f0b05209c94 https://github.com/phpmyadmin/localized_docs/commit/bb684c9e7108bb04ac7421b6126f1f0b05209c94 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[jira] [Resolved] (SPARK-19405) Add support to KinesisUtils for cross-account Kinesis reads via STS

2017-02-22 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-19405. - Resolution: Fixed Assignee: Adam Budde Fix Version/s: 2.2.0 Resolved with: https

[jira] [Created] (SPARK-19637) add to_json APIs to SQL

2017-02-16 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-19637: --- Summary: add to_json APIs to SQL Key: SPARK-19637 URL: https://issues.apache.org/jira/browse/SPARK-19637 Project: Spark Issue Type: New Feature

Re: welcoming Takuya Ueshin as a new Apache Spark committer

2017-02-13 Thread Burak Yavuz
Congrats Takuya! On Mon, Feb 13, 2017 at 2:17 PM, Dilip Biswal wrote: > Congratulations, Takuya! > > Regards, > Dilip Biswal > Tel: 408-463-4980 > dbis...@us.ibm.com > > > > - Original message - > From: Takeshi Yamamuro >

[jira] [Resolved] (SPARK-19542) Delete the temp checkpoint if a query is stopped without errors

2017-02-13 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-19542. - Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 > Delete the t

[jira] [Created] (SPARK-19543) from_json fails when the input row is empty

2017-02-09 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-19543: --- Summary: from_json fails when the input row is empty Key: SPARK-19543 URL: https://issues.apache.org/jira/browse/SPARK-19543 Project: Spark Issue Type: Bug

Re: [Structured Streaming] Using File Sink to store to hive table.

2017-02-06 Thread Burak Yavuz
le. How can I > do that? > > 2017-02-06 14:25 GMT-08:00 Burak Yavuz <brk...@gmail.com>: > >> Hi Egor, >> >> Structured Streaming handles all of its metadata itself, which files are >> actually valid, etc. You may use the "create table" syntax in SQL

Re: [Structured Streaming] Using File Sink to store to hive table.

2017-02-06 Thread Burak Yavuz
Hi Egor, Structured Streaming handles all of its metadata itself, which files are actually valid, etc. You may use the "create table" syntax in SQL to treat it like a hive table, but it will handle all partitioning information in its own metadata log. Is there a specific reason that you want to

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] 98ad19: Translated using Weblate (Turkish)

2017-02-01 Thread Burak Yavuz
Branch: refs/heads/QA_4_7 Home: https://github.com/phpmyadmin/phpmyadmin Commit: 98ad19576a8fb0fa588a6b060b88a7ab514b23e9 https://github.com/phpmyadmin/phpmyadmin/commit/98ad19576a8fb0fa588a6b060b88a7ab514b23e9 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2017

Re: eager? in dataframe's checkpoint

2017-01-31 Thread Burak Yavuz
ount happens most likely rdd.isCheckpointed > will be false, and the count will be on the rdd before it was checkpointed. > what is the benefit of that? > > > On Thu, Jan 26, 2017 at 11:19 PM, Burak Yavuz <brk...@gmail.com> wrote: > >> Hi, >> >> One of the goal

Re: eager? in dataframe's checkpoint

2017-01-26 Thread Burak Yavuz
Hi, One of the goals of checkpointing is to cut the RDD lineage. Otherwise you run into StackOverflowExceptions. If you eagerly checkpoint, you basically cut the lineage there, and the next operations all depend on the checkpointed DataFrame. If you don't checkpoint, you continue to build the

[jira] [Updated] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2017-01-26 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18218: Assignee: Weichen Xu > Optimize BlockMatrix multiplication, which may cause OOM and

[jira] [Resolved] (SPARK-18218) Optimize BlockMatrix multiplication, which may cause OOM and low parallelism usage problem in several cases

2017-01-26 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-18218. - Resolution: Implemented Fix Version/s: 2.2.0 Resolved by https://github.com/apache/spark

[jira] [Updated] (SPARK-19378) StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger

2017-01-26 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-19378: Description: If you have a StreamingDataFrame with an aggregation, we report a metric called

[jira] [Created] (SPARK-19378) StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger

2017-01-26 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-19378: --- Summary: StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger Key: SPARK-19378 URL: https://issues.apache.org/jira

Re: Java heap error during matrix multiplication

2017-01-26 Thread Burak Yavuz
Hi, Have you tried creating more column blocks? BlockMatrix matrix = cmatrix.toBlockMatrix(100, 100); for example. Is your data randomly spread out, or do you generally have clusters of data points together? On Wed, Jan 25, 2017 at 4:23 AM, Petr Shestov wrote: > Hi
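
The suggestion above can be made concrete with a little arithmetic. The sketch below is not Spark code; it only counts how many blocks a matrix is cut into for a given block size, on the understanding that Spark's BlockMatrix divides each dimension by the block size and rounds up. The 10000 x 10000 matrix size is a made-up example, and `block_grid` is a hypothetical helper name. The 1024 x 1024 case is, as far as I recall, what `toBlockMatrix()` uses when called without arguments.

```python
import math


def block_grid(num_rows, num_cols, rows_per_block, cols_per_block):
    """Return (row blocks, column blocks) for a matrix cut into fixed-size blocks.

    Each dimension is divided by the block size and rounded up, so a partial
    block at the edge still counts as one block.
    """
    return (
        math.ceil(num_rows / rows_per_block),
        math.ceil(num_cols / cols_per_block),
    )


# Hypothetical 10000 x 10000 matrix, chosen only for illustration.
coarse = block_grid(10000, 10000, 1024, 1024)  # few, large blocks
fine = block_grid(10000, 10000, 100, 100)      # many, small blocks
print(coarse, fine)
```

Smaller blocks give more, lighter units of work, which is presumably the point of passing (100, 100) in the reply above.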

[jira] [Updated] (SPARK-18020) Kinesis receiver does not snapshot when shard completes

2017-01-25 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18020: Assignee: Takeshi Yamamuro > Kinesis receiver does not snapshot when shard comple

[jira] [Resolved] (SPARK-18020) Kinesis receiver does not snapshot when shard completes

2017-01-25 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-18020. - Resolution: Fixed Fix Version/s: 2.2.0 Resolved by https://github.com/apache/spark/pull

Re: How to make the state in a streaming application idempotent?

2017-01-25 Thread Burak Yavuz
deshpande <deshpandesh...@gmail.com> wrote: > Thanks Burak. But with BloomFilter, won't I be getting a false positive? > > On Wed, Jan 25, 2017 at 11:28 AM, Burak Yavuz <brk...@gmail.com> wrote: > >> I noticed that 1 wouldn't be a problem, because you'll save t

Re: How to make the state in a streaming application idempotent?

2017-01-25 Thread Burak Yavuz
gave me 2 solutions > 1. Bloom filter --> problem in repopulating the bloom filter on restarts > 2. keeping the state of the unique ids > > Please elaborate on 2. > > > > On Wed, Jan 25, 2017 at 10:53 AM, Burak Yavuz <brk...@gmail.com> wrote: > >> I don't

Re: How to make the state in a streaming application idempotent?

2017-01-25 Thread Burak Yavuz
> Thanks > > On Wed, Jan 25, 2017 at 9:13 AM, Burak Yavuz <brk...@gmail.com> wrote: > >> Off the top of my head... (Each may have its own issues) >> >> If upstream you add a uniqueId to all your records, then you may use a >> BloomFilter to appro

Re: How to make the state in a streaming application idempotent?

2017-01-25 Thread Burak Yavuz
Off the top of my head... (Each may have its own issues) If upstream you add a uniqueId to all your records, then you may use a BloomFilter to approximate if you've seen a row before. The problem I can see with that approach is how to repopulate the bloom filter on restarts. If you are certain
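
A minimal sketch of the Bloom filter idea from the message above, in plain Python rather than Spark. It assumes every record is a dict carrying a `unique_id` field added upstream; the class, the function name, the filter size and the number of hashes are all illustrative choices, not anything from the thread.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # One salted SHA-256 digest per hash index gives num_hashes bit positions.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def drop_probable_duplicates(records, seen):
    """Keep only records whose unique_id the filter has not seen yet."""
    fresh = []
    for record in records:
        if not seen.might_contain(record["unique_id"]):
            seen.add(record["unique_id"])
            fresh.append(record)
    return fresh


seen = BloomFilter()
batch = [{"unique_id": "a"}, {"unique_id": "b"}, {"unique_id": "a"}]
print(drop_probable_duplicates(batch, seen))
```

Two caveats carry over from the thread. A false positive here means a genuinely new record is dropped as a duplicate, which is the concern raised in the follow-up. The filter also lives only in memory, so it has to be rebuilt or persisted across restarts.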

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] 5fbd21: Translated using Weblate (Turkish)

2017-01-25 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 5fbd21a0bd453f1c5393cc5f143a28bf36280e02 https://github.com/phpmyadmin/phpmyadmin/commit/5fbd21a0bd453f1c5393cc5f143a28bf36280e02 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2017

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Burak Yavuz
Thank you very much everyone! Hoping to help out the community as much as I can! Best, Burak On Tue, Jan 24, 2017 at 2:29 PM, Jacek Laskowski wrote: > Wow! At long last. Congrats Burak and Holden! > > p.s. I was a bit worried that the process of accepting new committers > is

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] 60b77e: Translated using Weblate (Turkish)

2017-01-23 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 60b77eea6e73c67199a1f11e7d83c9c6212f08c3 https://github.com/phpmyadmin/phpmyadmin/commit/60b77eea6e73c67199a1f11e7d83c9c6212f08c3 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2017

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] d121c6: Translated using Weblate (Turkish)

2017-01-23 Thread Burak Yavuz
Branch: refs/heads/QA_4_6 Home: https://github.com/phpmyadmin/phpmyadmin Commit: d121c692265078fb75b19e9b1f6eb49aae54c9ab https://github.com/phpmyadmin/phpmyadmin/commit/d121c692265078fb75b19e9b1f6eb49aae54c9ab Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2017

[phpMyAdmin Git] [phpmyadmin/localized_docs] 9e6df7: Translated using Weblate (Turkish)

2017-01-18 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 9e6df7a487803450187d60e21d3c8fe1341fca8b https://github.com/phpmyadmin/localized_docs/commit/9e6df7a487803450187d60e21d3c8fe1341fca8b Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

Re: [SQL][SPARK-14160] Maximum interval for o.a.s.sql.functions.window

2017-01-18 Thread Burak Yavuz
Hi Maciej, I believe it would be useful to either fix the documentation or fix the implementation. I'll leave it to the community to comment on. The code right now disallows intervals provided in months and years, because they are not a "consistently" fixed amount of time. A month can be 28, 29,
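
The point about months not being a fixed amount of time can be checked with the Python standard library. This is only an illustration of the calendar arithmetic, not of Spark's `window` implementation; `month_lengths` is a hypothetical helper name.

```python
import calendar
from datetime import timedelta


def month_lengths(year):
    """Number of days in each month of the given year, January first."""
    return [calendar.monthrange(year, month)[1] for month in range(1, 13)]


# A week is always the same number of seconds, so it can be a fixed interval.
week_seconds = timedelta(weeks=1).total_seconds()

# A month is not: its length depends on which month and which year it is.
distinct_2016 = sorted(set(month_lengths(2016)))  # leap year
distinct_2017 = sorted(set(month_lengths(2017)))  # common year
print(week_seconds, distinct_2016, distinct_2017)
```

Across a leap year and a common year a month takes four different lengths, whereas seconds, minutes, hours, days and weeks each map to one fixed duration.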

[phpMyAdmin Git] [phpmyadmin/localized_docs] 8004d4: Translated using Weblate (Turkish)

2017-01-17 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 8004d4ceb72c7446a3dccae369a0912c8948b72c https://github.com/phpmyadmin/localized_docs/commit/8004d4ceb72c7446a3dccae369a0912c8948b72c Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[phpMyAdmin Git] [phpmyadmin/localized_docs] d580dc: Translated using Weblate (Turkish)

2017-01-11 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: d580dc246ce2e159cf587637bc0d3cf5b06ad8b6 https://github.com/phpmyadmin/localized_docs/commit/d580dc246ce2e159cf587637bc0d3cf5b06ad8b6 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[phpMyAdmin Git] [phpmyadmin/localized_docs] c5fa88: Translated using Weblate (Turkish)

2017-01-09 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: c5fa88dc04a4a141da1be9af881a6f9b86575fdd https://github.com/phpmyadmin/localized_docs/commit/c5fa88dc04a4a141da1be9af881a6f9b86575fdd Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[phpMyAdmin Git] [phpmyadmin/localized_docs] c2fa72: Translated using Weblate (Turkish)

2016-12-23 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: c2fa72dd06b4bd698b2d915b49c0745ef219f594 https://github.com/phpmyadmin/localized_docs/commit/c2fa72dd06b4bd698b2d915b49c0745ef219f594 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] 6fe9e6: Translated using Weblate (Turkish)

2016-12-21 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 6fe9e68e762e257c7dd3ca3997fcef59e148789f https://github.com/phpmyadmin/phpmyadmin/commit/6fe9e68e762e257c7dd3ca3997fcef59e148789f Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2016

[jira] [Updated] (SPARK-18952) regex strings not properly escaped in codegen for aggregations

2016-12-20 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18952: Summary: regex strings not properly escaped in codegen for aggregations (was: regex strings

[jira] [Created] (SPARK-18952) regex strings not properly escaped in codegen

2016-12-20 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-18952: --- Summary: regex strings not properly escaped in codegen Key: SPARK-18952 URL: https://issues.apache.org/jira/browse/SPARK-18952 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-18927) MemorySink for StructuredStreaming can't recover from checkpoint if location is provided in conf

2016-12-19 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-18927: --- Summary: MemorySink for StructuredStreaming can't recover from checkpoint if location is provided in conf Key: SPARK-18927 URL: https://issues.apache.org/jira/browse/SPARK-18927

[jira] [Created] (SPARK-18900) Flaky Test: StateStoreSuite.maintenance

2016-12-16 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-18900: --- Summary: Flaky Test: StateStoreSuite.maintenance Key: SPARK-18900 URL: https://issues.apache.org/jira/browse/SPARK-18900 Project: Spark Issue Type: Test

[jira] [Updated] (SPARK-18888) partitionBy in DataStreamWriter in Python throws _to_seq not defined

2016-12-15 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18888: Affects Version/s: (was: 2.1.0) 2.0.2 > partitionBy in DataStreamWri

[jira] [Created] (SPARK-18888) partitionBy in DataStreamWriter in Python throws _to_seq not defined

2016-12-15 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-18888: --- Summary: partitionBy in DataStreamWriter in Python throws _to_seq not defined Key: SPARK-18888 URL: https://issues.apache.org/jira/browse/SPARK-18888 Project: Spark

[jira] [Created] (SPARK-18868) Flaky Test: StreamingQueryListenerSuite

2016-12-14 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-18868: --- Summary: Flaky Test: StreamingQueryListenerSuite Key: SPARK-18868 URL: https://issues.apache.org/jira/browse/SPARK-18868 Project: Spark Issue Type: Test

[phpMyAdmin Git] [phpmyadmin/localized_docs] 917856: Translated using Weblate (Turkish)

2016-12-14 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 917856aa5d2fde4a7be01719d802a920df96cb4a https://github.com/phpmyadmin/localized_docs/commit/917856aa5d2fde4a7be01719d802a920df96cb4a Author: Burak Yavuz <hitowerdi...@hotmail.com> Date:

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] e73a59: Translated using Weblate (Turkish)

2016-12-13 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: e73a59f18e949c7ee8ea2e15bbd95cfe785db674 https://github.com/phpmyadmin/phpmyadmin/commit/e73a59f18e949c7ee8ea2e15bbd95cfe785db674 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2016

[phpMyAdmin Git] [phpmyadmin/phpmyadmin] 2e561a: Translated using Weblate (Turkish)

2016-12-13 Thread Burak Yavuz
Branch: refs/heads/QA_4_6 Home: https://github.com/phpmyadmin/phpmyadmin Commit: 2e561a296e29df9c2f57d959a7b8c519921dcd25 https://github.com/phpmyadmin/phpmyadmin/commit/2e561a296e29df9c2f57d959a7b8c519921dcd25 Author: Burak Yavuz <hitowerdi...@hotmail.com> Date: 2016

[jira] [Created] (SPARK-18811) Stream Source resolution should happen in StreamExecution thread, not main thread

2016-12-09 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-18811: --- Summary: Stream Source resolution should happen in StreamExecution thread, not main thread Key: SPARK-18811 URL: https://issues.apache.org/jira/browse/SPARK-18811

Re: Spark Streaming - join streaming and static data

2016-12-06 Thread Burak Yavuz
Hi Daniela, This is trivial with Structured Streaming. If your Kafka cluster is 0.10.0 or above, you may use Spark 2.0.2 to create a Streaming DataFrame from Kafka, and then also create a DataFrame using the JDBC connection, and you may join those. In Spark 2.1, there's support for a function
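The stream-to-static join described in this reply could look roughly like the PySpark sketch below. It is only an illustration under stated assumptions: the function name `build_enriched_stream`, the join column `id`, and the idea that the Kafka record key carries that id are all hypothetical and do not come from the thread. It also assumes that the Kafka source package and a JDBC driver are on the classpath, and that any JDBC credentials are supplied separately.

```python
def build_enriched_stream(spark, kafka_servers, topic, jdbc_url, jdbc_table):
    """Join a Kafka-backed streaming DataFrame with a static JDBC DataFrame.

    `spark` is an existing SparkSession. The join column `id` is a
    hypothetical name chosen for this sketch.
    """
    # Streaming side: the Kafka source exposes binary `key` and `value`
    # columns, so cast them to strings before joining.
    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", kafka_servers)
        .option("subscribe", topic)
        .load()
        .selectExpr("CAST(key AS STRING) AS id",
                    "CAST(value AS STRING) AS payload")
    )

    # Static side: a regular batch DataFrame read over JDBC.
    lookup = (
        spark.read
        .format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", jdbc_table)
        .load()
    )

    # Stream-static inner join on the shared `id` column; the result is
    # itself a streaming DataFrame.
    return events.join(lookup, "id")
```

The returned DataFrame is still a streaming one, so nothing runs until it is handed to `writeStream` and started with a sink of your choice.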

[jira] [Commented] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2016-11-29 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706788#comment-15706788 ] Burak Yavuz commented on SPARK-18475: - I'd be happy to share performance results. You're right, I

[jira] [Updated] (SPARK-18634) Corruption and Correctness issues with exploding Python UDFs

2016-11-29 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18634: Description: There are some weird issues with exploding Python UDFs in SparkSQL. There are 2

[jira] [Updated] (SPARK-18634) Corruption and Correctness issues with exploding Python UDFs

2016-11-29 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18634: Description: There are some weird issues with exploding Python UDFs in SparkSQL. There are 2

[jira] [Updated] (SPARK-18634) Corruption and Correctness issues with exploding Python UDFs

2016-11-29 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18634: Summary: Corruption and Correctness issues with exploding Python UDFs (was: Issues with exploding

[jira] [Updated] (SPARK-18634) Corruption and Correctness issues with exploding Python UDFs

2016-11-29 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18634: Description: There are some weird issues with exploding Python UDFs in SparkSQL. There are 2

[jira] [Created] (SPARK-18634) Issues with exploding Python UDFs

2016-11-29 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-18634: --- Summary: Issues with exploding Python UDFs Key: SPARK-18634 URL: https://issues.apache.org/jira/browse/SPARK-18634 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18407) Inferred partition columns cause assertion error

2016-11-25 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15696337#comment-15696337 ] Burak Yavuz commented on SPARK-18407: - This is also resolved as part of https://issues.apache.org

[jira] [Commented] (SPARK-18510) Partition schema inference corrupts data

2016-11-20 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15681677#comment-15681677 ] Burak Yavuz commented on SPARK-18510: - No. Working on a separate fix > Partition schema infere

[jira] [Commented] (SPARK-18510) Partition schema inference corrupts data

2016-11-19 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15680328#comment-15680328 ] Burak Yavuz commented on SPARK-18510: - cc [~r...@databricks.com] I marked this as a blocker

[jira] [Updated] (SPARK-18510) Partition schema inference corrupts data

2016-11-19 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-18510: Description: Not sure if this is a regression from 2.0 to 2.1. I was investigating

[jira] [Created] (SPARK-18510) Partition schema inference corrupts data

2016-11-19 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-18510: --- Summary: Partition schema inference corrupts data Key: SPARK-18510 URL: https://issues.apache.org/jira/browse/SPARK-18510 Project: Spark Issue Type: Bug
