[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21889 Just FYI, we are unable to merge it if it has a correctness bug.
[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21103 **[Test build #93654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93654/testReport)** for PR 21103 at commit [`e902974`](https://github.com/apache/spark/commit/e9029746a9cbc204d043cb7a0f9c1c3285284b54). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21857 **[Test build #93656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93656/testReport)** for PR 21857 at commit [`1f107aa`](https://github.com/apache/spark/commit/1f107aaa1fb4e6f261c1720058877b943c46706d). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21837: [SPARK-24881][SQL] New Avro option - compression
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21837 **[Test build #93658 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93658/testReport)** for PR 21837 at commit [`5f83902`](https://github.com/apache/spark/commit/5f83902e2876745f8be245681e7cb41d69421778). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21879 **[Test build #93655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93655/testReport)** for PR 21879 at commit [`93c34da`](https://github.com/apache/spark/commit/93c34da713136eb7b4ed8bb8775353c8219efa22). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21103 **[Test build #93653 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93653/testReport)** for PR 21103 at commit [`4d01c98`](https://github.com/apache/spark/commit/4d01c9848e021006e2412ebb2db3e37782b5f41a). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21516 **[Test build #93657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93657/testReport)** for PR 21516 at commit [`50c1c1e`](https://github.com/apache/spark/commit/50c1c1e810fa27480ae7e72640cc8f67b44a60f1). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21837: [SPARK-24881][SQL] New Avro option - compression
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21837 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93658/ Test FAILed.
[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21516 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93657/ Test FAILed.
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21879 Merged build finished. Test FAILed.
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21857 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93656/ Test FAILed.
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21879 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93655/ Test FAILed.
[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21516 Merged build finished. Test FAILed.
[GitHub] spark issue #21837: [SPARK-24881][SQL] New Avro option - compression
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21837 Merged build finished. Test FAILed.
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21857 Merged build finished. Test FAILed.
[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21103 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93653/ Test FAILed.
[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21103 Merged build finished. Test FAILed.
[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21103 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93654/ Test FAILed.
[GitHub] spark issue #21837: [SPARK-24881][SQL] New Avro option - compression
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21837 jenkins, retest this, please
[GitHub] spark issue #21837: [SPARK-24881][SQL] New Avro option - compression
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21837 **[Test build #93659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93659/testReport)** for PR 21837 at commit [`5f83902`](https://github.com/apache/spark/commit/5f83902e2876745f8be245681e7cb41d69421778).
[GitHub] spark pull request #21103: [SPARK-23915][SQL] Add array_except function
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21103#discussion_r205685897 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -3805,3 +3799,233 @@ object ArrayUnion { new GenericArrayData(arrayBuffer) } } + +/** + * Returns an array of the elements in the intersect of x and y, without duplicates + */ +@ExpressionDescription( + usage = """ + _FUNC_(array1, array2) - Returns an array of the elements in array1 but not in array2, +without duplicates. + """, + examples = """ +Examples: + > SELECT _FUNC_(array(1, 2, 3), array(1, 3, 5)); + array(2) + """, + since = "2.4.0") +case class ArrayExcept(left: Expression, right: Expression) extends ArraySetLike { --- End diff -- WDYT?
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205685498 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -165,16 +183,112 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: result } - private def resolveNullableType(avroType: Schema, nullable: Boolean): Schema = { -if (nullable) { - // avro uses union to represent nullable type. - val fields = avroType.getTypes.asScala - assert(fields.length == 2) - val actualType = fields.filter(_.getType != NULL) - assert(actualType.length == 1) - actualType.head + // Resolve an Avro union against a supplied DataType, i.e. a LongType compared against + // a ["null", "long"] should return a schema of type Schema.Type.LONG + // This function also handles resolving a DataType against unions of 2 or more types, i.e. + // an IntType resolves against a ["int", "long", "null"] will correctly return a schema of + // type Schema.Type.LONG + private def resolveNullableType(avroType: Schema, catalystType: DataType, + nullable: Boolean): Schema = { +(nullable, avroType.getType) match { + case (false, Type.UNION) | (true, Type.UNION) => +// avro uses union to represent nullable type. +val fieldTypes = avroType.getTypes.asScala + +// If we're nullable, we need to have at least two types. Cases with more than two types +// are captured in test("read read-write, read-write w/ schema, read") w/ test.avro input +assert(fieldTypes.length >= 2) + +val actualType = catalystType match { + case NullType => fieldTypes.filter(_.getType == Type.NULL) + case BooleanType => fieldTypes.filter(_.getType == Type.BOOLEAN) + case ByteType => fieldTypes.filter(_.getType == Type.INT) + case BinaryType => +val at = fieldTypes.filter(x => x.getType == Type.BYTES || x.getType == Type.FIXED) +if (at.length > 1) { + throw new IncompatibleSchemaException( +s"Cannot resolve schema of ${catalystType} against union ${avroType.toString}") +} else { + at +} + case ShortType | IntegerType => fieldTypes.filter(_.getType == Type.INT) + case LongType => fieldTypes.filter(_.getType == Type.LONG) + case FloatType => fieldTypes.filter(_.getType == Type.FLOAT) + case DoubleType => fieldTypes.filter(_.getType == Type.DOUBLE) + case d: DecimalType => fieldTypes.filter(_.getType == Type.STRING) + case StringType => fieldTypes +.filter(x => x.getType == Type.STRING || x.getType == Type.ENUM) + case DateType => fieldTypes.filter(x => x.getType == Type.INT || x.getType == Type.LONG) + case TimestampType => fieldTypes.filter(_.getType == Type.LONG) + case ArrayType(et, containsNull) => +// Find array that matches the element type specified +fieldTypes.filter(x => x.getType == Type.ARRAY + && typeMatchesSchema(et, x.getElementType)) + case st: StructType => // Find the matching record! +val recordTypes = fieldTypes.filter(x => x.getType == Type.RECORD) +if (recordTypes.length > 1) { + throw new IncompatibleSchemaException( +"Unions of multiple record types are NOT supported with user-specified schema") +} +recordTypes + case MapType(kt, vt, valueContainsNull) => +// Find the map that matches the value type. 
Maps in Avro are always key type string +fieldTypes.filter(x => x.getType == Type.MAP && typeMatchesSchema(vt, x.getValueType)) + case other => +throw new IncompatibleSchemaException(s"Unexpected type: $other") +} + +assert(actualType.length == 1) +actualType.head + case (false, _) | (true, _) => avroType --- End diff -- case _ => avroType
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205685302 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -165,16 +183,112 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: result } - private def resolveNullableType(avroType: Schema, nullable: Boolean): Schema = { -if (nullable) { - // avro uses union to represent nullable type. - val fields = avroType.getTypes.asScala - assert(fields.length == 2) - val actualType = fields.filter(_.getType != NULL) - assert(actualType.length == 1) - actualType.head + // Resolve an Avro union against a supplied DataType, i.e. a LongType compared against + // a ["null", "long"] should return a schema of type Schema.Type.LONG + // This function also handles resolving a DataType against unions of 2 or more types, i.e. + // an IntType resolves against a ["int", "long", "null"] will correctly return a schema of + // type Schema.Type.LONG + private def resolveNullableType(avroType: Schema, catalystType: DataType, --- End diff -- rename to `resolveUnionType`?
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205648911 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -87,17 +88,33 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: case d: DecimalType => (getter, ordinal) => getter.getDecimal(ordinal, d.precision, d.scale).toString case StringType => -(getter, ordinal) => new Utf8(getter.getUTF8String(ordinal).getBytes) +(getter, ordinal) => + if (avroType.getType == Type.ENUM) { +new GenericData.EnumSymbol(avroType, getter.getUTF8String(ordinal).toString) + } else { +new Utf8(getter.getUTF8String(ordinal).getBytes) + } case BinaryType => -(getter, ordinal) => ByteBuffer.wrap(getter.getBinary(ordinal)) +(getter, ordinal) => + val data = getter.getBinary(ordinal) + if (avroType.getType == Type.FIXED) { +// Handles fixed-type fields in output schema. Test case is included in test.avro +// as it includes several fixed fields that would fail if we specify schema +// on-write without this condition +val fixed = new GenericData.Fixed(avroType) +fixed.bytes(data) +fixed + } else { +ByteBuffer.wrap(data) + } --- End diff -- This might be slow. In the executors, when each row is serialized, the whole `if-else` will be executed again and again rather than being resolved once into a specialized converter. We can consider resolving the specialized types earlier, in the driver, by

```scala
import org.apache.avro.generic.GenericData.{Fixed, EnumSymbol}
...
case StringType =>
  if (avroType.getType == Type.ENUM) {
    (getter, ordinal) => new EnumSymbol(avroType, getter.getUTF8String(ordinal).toString)
  } else {
    (getter, ordinal) => new Utf8(getter.getUTF8String(ordinal).getBytes)
  }
case BinaryType =>
  if (avroType.getType == Type.FIXED) {
    (getter, ordinal) => new Fixed(avroType, getter.getBinary(ordinal))
  } else {
    (getter, ordinal) => ByteBuffer.wrap(getter.getBinary(ordinal))
  }
```

so the returned lambda expression will not have any check on `FIXED` or `ENUM` types.
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205685728 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -165,16 +183,112 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: result } - private def resolveNullableType(avroType: Schema, nullable: Boolean): Schema = { -if (nullable) { - // avro uses union to represent nullable type. - val fields = avroType.getTypes.asScala - assert(fields.length == 2) - val actualType = fields.filter(_.getType != NULL) - assert(actualType.length == 1) - actualType.head + // Resolve an Avro union against a supplied DataType, i.e. a LongType compared against + // a ["null", "long"] should return a schema of type Schema.Type.LONG + // This function also handles resolving a DataType against unions of 2 or more types, i.e. + // an IntType resolves against a ["int", "long", "null"] will correctly return a schema of + // type Schema.Type.LONG + private def resolveNullableType(avroType: Schema, catalystType: DataType, + nullable: Boolean): Schema = { +(nullable, avroType.getType) match { + case (false, Type.UNION) | (true, Type.UNION) => +// avro uses union to represent nullable type. +val fieldTypes = avroType.getTypes.asScala + +// If we're nullable, we need to have at least two types. Cases with more than two types +// are captured in test("read read-write, read-write w/ schema, read") w/ test.avro input +assert(fieldTypes.length >= 2) --- End diff -- When it's non-nullable, is it possible to have `fieldTypes.length == 1`?
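For context on this question: Avro's Java API does allow a union with a single branch, so the `fieldTypes.length >= 2` assert is not guaranteed by the schema model itself. A minimal, standalone sketch (not code from the PR) that constructs such a schema:

```scala
import org.apache.avro.Schema
import org.apache.avro.Schema.Type
import scala.collection.JavaConverters._

object SingleBranchUnionDemo extends App {
  // Avro happily builds a union with one non-null branch, so a
  // non-nullable Catalyst field could, in principle, meet a
  // one-element union and trip the >= 2 assertion above.
  val union = Schema.createUnion(List(Schema.create(Type.LONG)).asJava)
  println(union)               // ["long"]
  println(union.getTypes.size) // 1
}
```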
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205683257 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -148,7 +165,8 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: val avroFields = avroStruct.getFields assert(avroFields.size() == catalystStruct.length) val fieldConverters = catalystStruct.zip(avroFields.asScala).map { - case (f1, f2) => newConverter(f1.dataType, resolveNullableType(f2.schema(), f1.nullable)) + case (f1, f2) => newConverter(f1.dataType, resolveNullableType( +f2.schema(), f1.dataType, f1.nullable)) --- End diff -- Nit, formatting:

```scala
case (f1, f2) =>
  newConverter(f1.dataType, resolveNullableType(f2.schema(), f1.dataType, f1.nullable))
```
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21879 retest this please
[GitHub] spark issue #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO support shou...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/21847 Since the data type in Spark is certain, why do we need to support an output Avro schema like `["int", "long", "null"]`? Can we just forbid such usage by having a rule for the schema: if the Avro type is a union, it has at most two types and one of them is the null type. Otherwise things can be complicated.
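A minimal sketch of the rule proposed here — the helper name is hypothetical, not code from the PR — which accepts a union only when it is the two-branch nullable form:

```scala
import org.apache.avro.Schema
import org.apache.avro.Schema.Type
import scala.collection.JavaConverters._

// Hypothetical validation for the proposed rule: a union is supported
// only if it has at most two branches and one of them is the null type,
// i.e. it merely encodes a nullable type.
def isSupportedUnion(schema: Schema): Boolean =
  schema.getType != Type.UNION || {
    val branches = schema.getTypes.asScala
    branches.size <= 2 && branches.exists(_.getType == Type.NULL)
  }
```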
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205684257 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -165,16 +183,112 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: result } - private def resolveNullableType(avroType: Schema, nullable: Boolean): Schema = { -if (nullable) { - // avro uses union to represent nullable type. - val fields = avroType.getTypes.asScala - assert(fields.length == 2) - val actualType = fields.filter(_.getType != NULL) - assert(actualType.length == 1) - actualType.head + // Resolve an Avro union against a supplied DataType, i.e. a LongType compared against + // a ["null", "long"] should return a schema of type Schema.Type.LONG + // This function also handles resolving a DataType against unions of 2 or more types, i.e. + // an IntType resolves against a ["int", "long", "null"] will correctly return a schema of + // type Schema.Type.LONG + private def resolveNullableType(avroType: Schema, catalystType: DataType, + nullable: Boolean): Schema = { +(nullable, avroType.getType) match { --- End diff -- Since the code is complicated and long, maybe it's easier to read with just an old-fashioned `if-else`.
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205687174 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -165,16 +183,112 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: result } - private def resolveNullableType(avroType: Schema, nullable: Boolean): Schema = { -if (nullable) { - // avro uses union to represent nullable type. - val fields = avroType.getTypes.asScala - assert(fields.length == 2) - val actualType = fields.filter(_.getType != NULL) - assert(actualType.length == 1) - actualType.head + // Resolve an Avro union against a supplied DataType, i.e. a LongType compared against + // a ["null", "long"] should return a schema of type Schema.Type.LONG + // This function also handles resolving a DataType against unions of 2 or more types, i.e. + // an IntType resolves against a ["int", "long", "null"] will correctly return a schema of + // type Schema.Type.LONG + private def resolveNullableType(avroType: Schema, catalystType: DataType, + nullable: Boolean): Schema = { +(nullable, avroType.getType) match { + case (false, Type.UNION) | (true, Type.UNION) => +// avro uses union to represent nullable type. +val fieldTypes = avroType.getTypes.asScala + +// If we're nullable, we need to have at least two types. Cases with more than two types +// are captured in test("read read-write, read-write w/ schema, read") w/ test.avro input +assert(fieldTypes.length >= 2) + +val actualType = catalystType match { + case NullType => fieldTypes.filter(_.getType == Type.NULL) + case BooleanType => fieldTypes.filter(_.getType == Type.BOOLEAN) + case ByteType => fieldTypes.filter(_.getType == Type.INT) + case BinaryType => +val at = fieldTypes.filter(x => x.getType == Type.BYTES || x.getType == Type.FIXED) +if (at.length > 1) { + throw new IncompatibleSchemaException( +s"Cannot resolve schema of ${catalystType} against union ${avroType.toString}") +} else { + at +} + case ShortType | IntegerType => fieldTypes.filter(_.getType == Type.INT) + case LongType => fieldTypes.filter(_.getType == Type.LONG) + case FloatType => fieldTypes.filter(_.getType == Type.FLOAT) + case DoubleType => fieldTypes.filter(_.getType == Type.DOUBLE) + case d: DecimalType => fieldTypes.filter(_.getType == Type.STRING) + case StringType => fieldTypes +.filter(x => x.getType == Type.STRING || x.getType == Type.ENUM) + case DateType => fieldTypes.filter(x => x.getType == Type.INT || x.getType == Type.LONG) + case TimestampType => fieldTypes.filter(_.getType == Type.LONG) + case ArrayType(et, containsNull) => +// Find array that matches the element type specified +fieldTypes.filter(x => x.getType == Type.ARRAY + && typeMatchesSchema(et, x.getElementType)) + case st: StructType => // Find the matching record! +val recordTypes = fieldTypes.filter(x => x.getType == Type.RECORD) +if (recordTypes.length > 1) { + throw new IncompatibleSchemaException( +"Unions of multiple record types are NOT supported with user-specified schema") +} +recordTypes + case MapType(kt, vt, valueContainsNull) => +// Find the map that matches the value type. Maps in Avro are always key type string +fieldTypes.filter(x => x.getType == Type.MAP && typeMatchesSchema(vt, x.getValueType)) + case other => +throw new IncompatibleSchemaException(s"Unexpected type: $other") +} + +assert(actualType.length == 1) --- End diff -- We need to show error message if the length is not 1. 
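A sketch of what that error handling could look like in place of the bare assert — hedged, the exact message wording is illustrative:

```scala
// Illustrative replacement for `assert(actualType.length == 1)`: raise
// IncompatibleSchemaException (already used in this file) with a
// descriptive message when the union does not resolve to exactly one branch.
if (actualType.length != 1) {
  throw new IncompatibleSchemaException(
    s"Cannot resolve $catalystType against union ${avroType.toString}: " +
      s"found ${actualType.length} matching branches")
}
actualType.head
```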
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21879 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1397/ Test PASSed.
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21879 Merged build finished. Test PASSed.
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21879 **[Test build #93660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93660/testReport)** for PR 21879 at commit [`93c34da`](https://github.com/apache/spark/commit/93c34da713136eb7b4ed8bb8775353c8219efa22).
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205692778 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -165,16 +183,112 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: result } - private def resolveNullableType(avroType: Schema, nullable: Boolean): Schema = { -if (nullable) { - // avro uses union to represent nullable type. - val fields = avroType.getTypes.asScala - assert(fields.length == 2) - val actualType = fields.filter(_.getType != NULL) - assert(actualType.length == 1) - actualType.head + // Resolve an Avro union against a supplied DataType, i.e. a LongType compared against + // a ["null", "long"] should return a schema of type Schema.Type.LONG + // This function also handles resolving a DataType against unions of 2 or more types, i.e. + // an IntType resolves against a ["int", "long", "null"] will correctly return a schema of + // type Schema.Type.LONG + private def resolveNullableType(avroType: Schema, catalystType: DataType, + nullable: Boolean): Schema = { +(nullable, avroType.getType) match { + case (false, Type.UNION) | (true, Type.UNION) => +// avro uses union to represent nullable type. +val fieldTypes = avroType.getTypes.asScala + +// If we're nullable, we need to have at least two types. Cases with more than two types +// are captured in test("read read-write, read-write w/ schema, read") w/ test.avro input +assert(fieldTypes.length >= 2) + +val actualType = catalystType match { + case NullType => fieldTypes.filter(_.getType == Type.NULL) + case BooleanType => fieldTypes.filter(_.getType == Type.BOOLEAN) + case ByteType => fieldTypes.filter(_.getType == Type.INT) + case BinaryType => +val at = fieldTypes.filter(x => x.getType == Type.BYTES || x.getType == Type.FIXED) +if (at.length > 1) { + throw new IncompatibleSchemaException( +s"Cannot resolve schema of ${catalystType} against union ${avroType.toString}") +} else { + at +} + case ShortType | IntegerType => fieldTypes.filter(_.getType == Type.INT) + case LongType => fieldTypes.filter(_.getType == Type.LONG) + case FloatType => fieldTypes.filter(_.getType == Type.FLOAT) + case DoubleType => fieldTypes.filter(_.getType == Type.DOUBLE) + case d: DecimalType => fieldTypes.filter(_.getType == Type.STRING) + case StringType => fieldTypes +.filter(x => x.getType == Type.STRING || x.getType == Type.ENUM) + case DateType => fieldTypes.filter(x => x.getType == Type.INT || x.getType == Type.LONG) + case TimestampType => fieldTypes.filter(_.getType == Type.LONG) + case ArrayType(et, containsNull) => +// Find array that matches the element type specified +fieldTypes.filter(x => x.getType == Type.ARRAY + && typeMatchesSchema(et, x.getElementType)) + case st: StructType => // Find the matching record! +val recordTypes = fieldTypes.filter(x => x.getType == Type.RECORD) +if (recordTypes.length > 1) { + throw new IncompatibleSchemaException( +"Unions of multiple record types are NOT supported with user-specified schema") +} +recordTypes + case MapType(kt, vt, valueContainsNull) => +// Find the map that matches the value type. Maps in Avro are always key type string +fieldTypes.filter(x => x.getType == Type.MAP && typeMatchesSchema(vt, x.getValueType)) + case other => +throw new IncompatibleSchemaException(s"Unexpected type: $other") +} + +assert(actualType.length == 1) --- End diff -- Can you elaborate when `actualType.length == 0` or `actualType.length > 1`? 
Is it possible that `catalystType` is `Int`, and `fieldTypes` only contains `Long`? Do we want to do the promotion?
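To make the promotion question concrete: with the matching rules in this diff, an `IntegerType` resolved against a union of only `["long", "null"]` yields zero matches and would hit the assert. A hedged sketch of what opt-in widening could look like (hypothetical, not in the PR):

```scala
// Hypothetical widening: let ShortType/IntegerType also resolve against a
// "long" branch when the union has no "int" branch, mirroring Avro's own
// int -> long promotion rules.
case ShortType | IntegerType =>
  val exact = fieldTypes.filter(_.getType == Type.INT)
  if (exact.nonEmpty) exact else fieldTypes.filter(_.getType == Type.LONG)
```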
[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL]: Built-in AVRO suppo...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21847#discussion_r205692946 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -165,16 +183,112 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: result } - private def resolveNullableType(avroType: Schema, nullable: Boolean): Schema = { -if (nullable) { - // avro uses union to represent nullable type. - val fields = avroType.getTypes.asScala - assert(fields.length == 2) - val actualType = fields.filter(_.getType != NULL) - assert(actualType.length == 1) - actualType.head + // Resolve an Avro union against a supplied DataType, i.e. a LongType compared against + // a ["null", "long"] should return a schema of type Schema.Type.LONG + // This function also handles resolving a DataType against unions of 2 or more types, i.e. + // an IntType resolves against a ["int", "long", "null"] will correctly return a schema of + // type Schema.Type.LONG + private def resolveNullableType(avroType: Schema, catalystType: DataType, + nullable: Boolean): Schema = { +(nullable, avroType.getType) match { + case (false, Type.UNION) | (true, Type.UNION) => +// avro uses union to represent nullable type. +val fieldTypes = avroType.getTypes.asScala + +// If we're nullable, we need to have at least two types. Cases with more than two types +// are captured in test("read read-write, read-write w/ schema, read") w/ test.avro input +assert(fieldTypes.length >= 2) + +val actualType = catalystType match { + case NullType => fieldTypes.filter(_.getType == Type.NULL) + case BooleanType => fieldTypes.filter(_.getType == Type.BOOLEAN) + case ByteType => fieldTypes.filter(_.getType == Type.INT) + case BinaryType => +val at = fieldTypes.filter(x => x.getType == Type.BYTES || x.getType == Type.FIXED) +if (at.length > 1) { + throw new IncompatibleSchemaException( +s"Cannot resolve schema of ${catalystType} against union ${avroType.toString}") +} else { + at +} + case ShortType | IntegerType => fieldTypes.filter(_.getType == Type.INT) + case LongType => fieldTypes.filter(_.getType == Type.LONG) + case FloatType => fieldTypes.filter(_.getType == Type.FLOAT) + case DoubleType => fieldTypes.filter(_.getType == Type.DOUBLE) + case d: DecimalType => fieldTypes.filter(_.getType == Type.STRING) + case StringType => fieldTypes +.filter(x => x.getType == Type.STRING || x.getType == Type.ENUM) + case DateType => fieldTypes.filter(x => x.getType == Type.INT || x.getType == Type.LONG) + case TimestampType => fieldTypes.filter(_.getType == Type.LONG) + case ArrayType(et, containsNull) => +// Find array that matches the element type specified +fieldTypes.filter(x => x.getType == Type.ARRAY + && typeMatchesSchema(et, x.getElementType)) + case st: StructType => // Find the matching record! +val recordTypes = fieldTypes.filter(x => x.getType == Type.RECORD) +if (recordTypes.length > 1) { + throw new IncompatibleSchemaException( +"Unions of multiple record types are NOT supported with user-specified schema") +} +recordTypes + case MapType(kt, vt, valueContainsNull) => +// Find the map that matches the value type. Maps in Avro are always key type string +fieldTypes.filter(x => x.getType == Type.MAP && typeMatchesSchema(vt, x.getValueType)) + case other => +throw new IncompatibleSchemaException(s"Unexpected type: $other") +} + +assert(actualType.length == 1) +actualType.head + case (false, _) | (true, _) => avroType +} + } + + // Given a Schema and a DataType, do they match? 
+ private def typeMatchesSchema(catalystType: DataType, avroSchema: Schema): Boolean = { +if (catalystType.isInstanceOf[StructType]) { + val avroFields = resolveNullableType(avroSchema, catalystType, +avroSchema.getType == Type.UNION) +.getFields + if (avroFields.size() == catalystType.asInstanceOf[StructType].length) { + catalystType.asInstanceOf[StructType].zip(avroFields.asScala).forall { + case (f1, f2) => typeMatchesSchema(f1.dataType, f2.schema) +
[GitHub] spark issue #21067: [SPARK-23980][K8S] Resilient Spark driver on Kubernetes
Github user baluchicken commented on the issue: https://github.com/apache/spark/pull/21067 Thanks for the responses, I learned a lot from this :) I am going to close this PR for now, and maybe collaborate on the Kubernetes ticket raised by this PR. Thanks.
[GitHub] spark pull request #21067: [SPARK-23980][K8S] Resilient Spark driver on Kube...
Github user baluchicken closed the pull request at: https://github.com/apache/spark/pull/21067
[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21889 @gatorsmile, just for clarification, you mean correctness regressions in existing features, right?
[GitHub] spark pull request #21891: [SPARK-24931][CORE]CoarseGrainedExecutorBackend s...
GitHub user bingbai0912 opened a pull request: https://github.com/apache/spark/pull/21891 [SPARK-24931][CORE] CoarseGrainedExecutorBackend send wrong 'Reason' w… TaskSetManager

## What changes were proposed in this pull request?

When CoarseGrainedExecutorBackend finds that the executor is not available, it sends a "RemoveExecutor" message with an "ExecutorExited" reason instead of a bare "ExecutorLossReason". That way it can tell the driver whether the executor exit was "exitCausedByApp", which should be false here, so the driver (TaskSetManager) can "handleFailedTask" correctly and avoid the task failure count climbing to "maxTaskFailures" and failing the job.

## How was this patch tested?

Tested in my own cluster.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/bingbai0912/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21891.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21891 commit 3b3f224d6ac2dc3d3a0c21ed14502329af3cbae8 Author: baibing Date: 2018-07-27T07:49:50Z [SPARK-24931][CORE]CoarseGrainedExecutorBackend send wrong 'Reason' when executor exits which leading to job failed.
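For readers unfamiliar with the types involved: `ExecutorExited` is a subclass of `ExecutorLossReason` carrying an `exitCausedByApp` flag, which is what lets `TaskSetManager` decide whether a failure counts toward `spark.task.maxFailures`. A rough sketch of the shape of the change being described (illustrative only, not the PR's actual diff):

```scala
// Illustrative: report the loss with ExecutorExited and
// exitCausedByApp = false, so the driver's TaskSetManager does not count
// the resulting task failures against spark.task.maxFailures.
driver.foreach(_.ask[Boolean](RemoveExecutor(executorId,
  ExecutorExited(exitCode, exitCausedByApp = false,
    "Executor exited for a reason unrelated to the running tasks"))))
```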
[GitHub] spark issue #21891: [SPARK-24931][CORE]CoarseGrainedExecutorBackend send wro...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21891 Can one of the admins verify this patch?
[GitHub] spark issue #20451: [SPARK-23146][WIP] Support client mode for Kubernetes in...
Github user echarles commented on the issue: https://github.com/apache/spark/pull/20451 See #21748
[GitHub] spark pull request #20451: [SPARK-23146][WIP] Support client mode for Kubern...
Github user echarles closed the pull request at: https://github.com/apache/spark/pull/20451
[GitHub] spark pull request #21802: [SPARK-23928][SQL] Add shuffle collection functio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205702898 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_column(col))) +@since(2.4) +def shuffle(col): +""" +Collection function: Generates a random permutation of the given array. + +.. note:: The function is non-deterministic because its results depends on order of rows which --- End diff -- typo: `results depends`, found while reading this one.
[GitHub] spark pull request #21802: [SPARK-23928][SQL] Add shuffle collection functio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205703459 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1184,6 +1184,110 @@ case class ArraySort(child: Expression) extends UnaryExpression with ArraySortLi override def prettyName: String = "array_sort" } +/** + * Returns a random permutation of the given array. + */ +@ExpressionDescription( + usage = "_FUNC_(array) - Returns a random permutation of the given array.", + examples = """ +Examples: + > SELECT _FUNC_(array(1, 20, 3, 5)); + [3, 1, 5, 20] + > SELECT _FUNC_(array(1, 20, null, 3)); + [20, null, 3, 1] + """, since = "2.4.0") --- End diff -- We could add `note` here too.
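That is, something along these lines — a sketch of the suggestion, assuming the `note` field of `@ExpressionDescription` that the comment refers to, with wording mirroring the Python docstring:

```scala
@ExpressionDescription(
  usage = "_FUNC_(array) - Returns a random permutation of the given array.",
  examples = """
    Examples:
      > SELECT _FUNC_(array(1, 20, 3, 5));
       [3, 1, 5, 20]
  """,
  note = "The function is non-deterministic.",
  since = "2.4.0")
```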
[GitHub] spark pull request #21802: [SPARK-23928][SQL] Add shuffle collection functio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205703558 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -3545,6 +3545,14 @@ object functions { */ def array_max(e: Column): Column = withExpr { ArrayMax(e.expr) } + /** + * Returns a random permutation of the given array. + * + * @group collection_funcs + * @since 2.4.0 --- End diff -- Shall we match the documentation here as well?
[GitHub] spark pull request #21802: [SPARK-23928][SQL] Add shuffle collection functio...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21802#discussion_r205704109 --- Diff: python/pyspark/sql/functions.py --- @@ -2382,6 +2382,20 @@ def array_sort(col): return Column(sc._jvm.functions.array_sort(_to_java_column(col))) +@since(2.4) +def shuffle(col): +""" +Collection function: Generates a random permutation of the given array. + +.. note:: The function is non-deterministic because its results depends on order of rows which +may be non-deterministic after a shuffle. + +:param col: name of column or expression --- End diff -- Python doctest looks missing.
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21802 Looks good to me too
[GitHub] spark pull request #21826: [SPARK-24872] Replace the symbol '||' of Or opera...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21826#discussion_r205713034 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala --- @@ -455,4 +456,10 @@ class PredicateSuite extends SparkFunSuite with ExpressionEvalHelper { interpreted.initialize(0) assert(interpreted.eval(new UnsafeRow())) } + + test("[SPARK-24872] Replace the symbol '||' of Or operator with 'or'") { --- End diff -- tiny nit: `[SPARK-24872] ` -> `SPARK-24872: `
[GitHub] spark issue #21631: [SPARK-24645][SQL] Skip parsing when csvColumnPruning en...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21631

> do we still hit the bug when parsing csv data?

I have checked uniVocity 2.7.2; there is no problem in this version.
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21802 Merged build finished. Test PASSed.
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21802 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1398/ Test PASSed.
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21802 **[Test build #93661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93661/testReport)** for PR 21802 at commit [`4135690`](https://github.com/apache/spark/commit/4135690f2cf1eea375a1a4f1697c0ffdb7436627).
[GitHub] spark issue #21631: [SPARK-24645][SQL] Skip parsing when csvColumnPruning en...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21631 @MaxGekk, thanks. mind opening a PR to upgrade?
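The upgrade itself would be a one-line version bump of the parser dependency. Shown here in sbt coordinates for illustration — Spark's own build would change the version in its Maven pom instead:

```scala
// uniVocity coordinates for the version verified above.
libraryDependencies += "com.univocity" % "univocity-parsers" % "2.7.2"
```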
[GitHub] spark pull request #21748: [SPARK-23146][K8S] Support client mode.
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/21748#discussion_r205721972 --- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/ClientModeTestsSuite.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.deploy.k8s.integrationtest + +import org.scalatest.concurrent.Eventually +import scala.collection.JavaConverters._ + +import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.{k8sTestTag, INTERVAL, TIMEOUT} + +trait ClientModeTestsSuite { k8sSuite: KubernetesSuite => + + test("Run in client mode.", k8sTestTag) { +val labels = Map("spark-app-selector" -> driverPodName) +val driverPort = 7077 +val blockManagerPort = 1 +val driverService = testBackend + .getKubernetesClient + .services() + .inNamespace(kubernetesTestComponents.namespace) + .createNew() +.withNewMetadata() + .withName(s"$driverPodName-svc") + .endMetadata() +.withNewSpec() + .withClusterIP("None") + .withSelector(labels.asJava) + .addNewPort() +.withName("driver-port") +.withPort(driverPort) +.withNewTargetPort(driverPort) +.endPort() + .addNewPort() +.withName("block-manager") +.withPort(blockManagerPort) +.withNewTargetPort(blockManagerPort) +.endPort() + .endSpec() +.done() +try { + val driverPod = testBackend +.getKubernetesClient +.pods() +.inNamespace(kubernetesTestComponents.namespace) +.createNew() + .withNewMetadata() + .withName(driverPodName) + .withLabels(labels.asJava) + .endMetadata() +.withNewSpec() + .withServiceAccountName("default") --- End diff -- @mccheah if people use spark-rbac.yaml this will fail.
[GitHub] spark issue #21844: Spark 24873
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21844 @hejiefang can you close this?
[GitHub] spark pull request #21103: [SPARK-23915][SQL] Add array_except function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21103#discussion_r205733831 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -3805,3 +3799,233 @@ object ArrayUnion { new GenericArrayData(arrayBuffer) } } + +/** + * Returns an array of the elements in the intersect of x and y, without duplicates + */ +@ExpressionDescription( + usage = """ + _FUNC_(array1, array2) - Returns an array of the elements in array1 but not in array2, +without duplicates. + """, + examples = """ +Examples: + > SELECT _FUNC_(array(1, 2, 3), array(1, 3, 5)); + array(2) + """, + since = "2.4.0") +case class ArrayExcept(left: Expression, right: Expression) extends ArraySetLike { --- End diff -- we can overwrite `dataType` while still extending `ComplexTypeMergingExpression` to use the checks.
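A sketch of the shape being suggested — hedged, the placeholder members are illustrative and only the `dataType` override is the point:

```scala
case class ArrayExcept(left: Expression, right: Expression)
  extends BinaryExpression with ComplexTypeMergingExpression {

  // ComplexTypeMergingExpression still contributes its input-type checks,
  // but array_except's result only contains elements of array1, so the
  // result type is pinned to the left child's type rather than the merged one.
  override def dataType: DataType = left.dataType

  override def nullSafeEval(lhs: Any, rhs: Any): Any = ??? // body as in the PR
  override def prettyName: String = "array_except"
}
```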
[GitHub] spark pull request #21882: [SPARK-24934][SQL] Explicitly whitelist supported...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21882#discussion_r205734835 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala --- @@ -183,6 +183,18 @@ case class InMemoryTableScanExec( private val stats = relation.partitionStatistics private def statsFor(a: Attribute) = stats.forAttribute(a) + // Currently, only use statistics from atomic types except binary type only. + private object ExtractableLiteral { +def unapply(expr: Expression): Option[Literal] = expr match { + case lit: Literal => lit.dataType match { +case BinaryType => None --- End diff -- can we also add a test for binary type?
[GitHub] spark issue #21882: [SPARK-24934][SQL] Explicitly whitelist supported types ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21882 LGTM
[GitHub] spark pull request #21844: Spark 24873
Github user hejiefang closed the pull request at: https://github.com/apache/spark/pull/21844
[GitHub] spark issue #21816: [SPARK-24794][CORE] Driver launched through rest should ...
Github user bsikander commented on the issue: https://github.com/apache/spark/pull/21816 @vanzin Could you please have a look at this change?
[GitHub] spark issue #21844: Spark 24873
Github user hejiefang commented on the issue: https://github.com/apache/spark/pull/21844 OK. Sorry, I didn't know I could close it.
[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21854 Regardless of the implementation, is it expected to produce different UUIDs for different micro-batches? Personally I think it's reasonable; micro-batch and continuous execution should produce the same result. cc @tdas @zsxwing @jose-torres
[GitHub] spark pull request #21850: [SPARK-24892] [SQL] Simplify `CaseWhen` to `If` w...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21850#discussion_r205737337 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -414,6 +414,9 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper { // these branches can be pruned away val (h, t) = branches.span(_._1 != TrueLiteral) CaseWhen( h :+ t.head, None) + + case CaseWhen(Seq((cond, trueValue)), elseValue) => +If(cond, trueValue, elseValue.getOrElse(Literal(null, trueValue.dataType))) --- End diff --

> optimization rules in If which may not be implemented for CaseWhen case.

Shall we just implement more optimizer rules for CASE WHEN to cover all the cases?
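A minimal, self-contained model of the equivalence the new rule relies on — plain Scala, not Catalyst — showing that a single-branch CASE WHEN behaves exactly like IF, including the missing-ELSE-means-null case:

```scala
// CASE WHEN cond THEN v [ELSE e] END with one branch vs. IF(cond, v, e);
// None stands in for SQL NULL when no ELSE is given.
def caseWhen[T](branches: Seq[(Boolean, T)], elseValue: Option[T]): Option[T] =
  branches.collectFirst { case (true, v) => v }.orElse(elseValue)

def iff[T](cond: Boolean, trueValue: T, elseValue: Option[T]): Option[T] =
  if (cond) Some(trueValue) else elseValue

assert(caseWhen(Seq(true -> 1), Some(2)) == iff(true, 1, Some(2))) // Some(1)
assert(caseWhen(Seq(false -> 1), None) == iff(false, 1, None))     // None
```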
[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20405 I think this can be removed in favor of https://github.com/apache/spark/pull/21822
[GitHub] spark issue #21706: [SPARK-24702] Fix Unable to cast to calendar interval in...
Github user dmateusp commented on the issue: https://github.com/apache/spark/pull/21706 hey @HyukjinKwon thanks for coming back to me on this :) I'll close the PR now, and start a thread later today on the dev mailing list
[GitHub] spark pull request #21706: [SPARK-24702] Fix Unable to cast to calendar inte...
Github user dmateusp closed the pull request at: https://github.com/apache/spark/pull/21706
[GitHub] spark issue #21837: [SPARK-24881][SQL] New Avro option - compression
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21837 **[Test build #93659 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93659/testReport)** for PR 21837 at commit [`5f83902`](https://github.com/apache/spark/commit/5f83902e2876745f8be245681e7cb41d69421778). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21879 **[Test build #93660 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93660/testReport)** for PR 21879 at commit [`93c34da`](https://github.com/apache/spark/commit/93c34da713136eb7b4ed8bb8775353c8219efa22). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21837: [SPARK-24881][SQL] New Avro option - compression
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21837 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21837: [SPARK-24881][SQL] New Avro option - compression
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21837 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93659/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21879 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21879: [SPARK-24927][BUILD][BRANCH-2.3] The scope of snappy-jav...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21879 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93660/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21732: [SPARK-24762][SQL] Aggregator should be able to use Opti...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21732 Again, can we always support `Option[Product]` with some special handling for top-level encoder expression? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
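For readers following the thread, a hedged sketch of the kind of `Aggregator` in question, whose buffer and output are an `Option[Product]` (here `Option[(String, Int)]`); whether the top-level encoder can handle this at runtime is exactly the open question in this PR.

```scala
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

// Tracks the longest input string and its length; None until any input arrives.
val longest = new Aggregator[String, Option[(String, Int)], Option[(String, Int)]] {
  def zero: Option[(String, Int)] = None
  def reduce(b: Option[(String, Int)], a: String): Option[(String, Int)] =
    Some(b.filter(_._2 >= a.length).getOrElse((a, a.length)))
  def merge(b1: Option[(String, Int)], b2: Option[(String, Int)]): Option[(String, Int)] =
    (b1 ++ b2).reduceOption((x, y) => if (x._2 >= y._2) x else y)
  def finish(r: Option[(String, Int)]): Option[(String, Int)] = r
  // Option is a Product, so this compiles; runtime support for a top-level
  // Option[Product] encoder is what the PR is about.
  def bufferEncoder: Encoder[Option[(String, Int)]] = Encoders.product
  def outputEncoder: Encoder[Option[(String, Int)]] = Encoders.product
}

// Usage would be e.g. stringDataset.select(longest.toColumn)
```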
[GitHub] spark pull request #21748: [SPARK-23146][K8S] Support client mode.
Github user ifilonenko commented on a diff in the pull request: https://github.com/apache/spark/pull/21748#discussion_r205752304

--- Diff: resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/ClientModeTestsSuite.scala ---

@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.integrationtest
+
+import org.scalatest.concurrent.Eventually
+import scala.collection.JavaConverters._
+
+import org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.{k8sTestTag, INTERVAL, TIMEOUT}
+
+trait ClientModeTestsSuite { k8sSuite: KubernetesSuite =>
+
+  test("Run in client mode.", k8sTestTag) {
+    val labels = Map("spark-app-selector" -> driverPodName)
+    val driverPort = 7077
+    val blockManagerPort = 1
+    val driverService = testBackend
+      .getKubernetesClient
+      .services()
+      .inNamespace(kubernetesTestComponents.namespace)
+      .createNew()
+        .withNewMetadata()
+          .withName(s"$driverPodName-svc")
+          .endMetadata()
+        .withNewSpec()
+          .withClusterIP("None")
+          .withSelector(labels.asJava)
+          .addNewPort()
+            .withName("driver-port")
+            .withPort(driverPort)
+            .withNewTargetPort(driverPort)
+            .endPort()
+          .addNewPort()
+            .withName("block-manager")
+            .withPort(blockManagerPort)
+            .withNewTargetPort(blockManagerPort)
+            .endPort()
+          .endSpec()
+        .done()
+    try {
+      val driverPod = testBackend
+        .getKubernetesClient
+        .pods()
+        .inNamespace(kubernetesTestComponents.namespace)
+        .createNew()
+          .withNewMetadata()
+            .withName(driverPodName)
+            .withLabels(labels.asJava)
+            .endMetadata()
+          .withNewSpec()
+            .withServiceAccountName("default")

--- End diff --

+1

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21809: [SPARK-24851][UI] Map a Stage ID to it's Associat...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/21809#discussion_r205755070

--- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala ---

@@ -112,10 +112,14 @@ private[spark] class AppStatusStore(
     }
   }

-  def stageAttempt(stageId: Int, stageAttemptId: Int, details: Boolean = false): v1.StageData = {
+  def stageAttempt(stageId: Int, stageAttemptId: Int,

--- End diff --

Changing the return type to `(StageData, jobIds)` might be simpler.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
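A self-contained toy of the suggestion, with simplified stand-ins for `v1.StageData` and `StageDataWrapper`: returning a tuple lets the caller get the stage data and its job ids from a single lookup.

```scala
// Simplified stand-ins; the real types live in org.apache.spark.status.
case class StageData(stageId: Int, attemptId: Int, name: String)
case class StageDataWrapper(info: StageData, jobIds: Set[Int])

// A toy status store keyed by (stageId, attemptId).
val store = Map((1, 0) -> StageDataWrapper(StageData(1, 0, "collect"), Set(3, 4)))

// One call returns both pieces instead of widening the parameter list.
def stageAttempt(stageId: Int, stageAttemptId: Int): (StageData, Set[Int]) = {
  val wrapper = store((stageId, stageAttemptId))
  (wrapper.info, wrapper.jobIds)
}

val (stageData, jobIds) = stageAttempt(1, 0)
assert(jobIds == Set(3, 4))
```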
[GitHub] spark pull request #21809: [SPARK-24851][UI] Map a Stage ID to it's Associat...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/21809#discussion_r205752305

--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---

@@ -105,16 +105,29 @@ private[ui] class StagePage(parent: StagesTab, store: AppStatusStore) extends We
     val stageAttemptId = parameterAttempt.toInt
     val stageHeader = s"Details for Stage $stageId (Attempt $stageAttemptId)"

-    val stageDataWrapper = parent.store.stageAttempt(stageId, stageAttemptId, details = false)
-    val stageData = parent.store
-      .asOption(stageDataWrapper.info)
-      .getOrElse {
+    var stageDataWrapper: StageDataWrapper = null
+    try {
+      stageDataWrapper = parent.store.stageAttempt(stageId, stageAttemptId, details = false)
+    } catch {
+      case e: NoSuchElementException => e.getMessage
+    }
+    var stageData: StageData = null
+    if (stageDataWrapper != null) {
+      stageData = parent.store
+        .asOption(stageDataWrapper.info)
+        .get
+    } else {
+      stageData = {

--- End diff --

this code branch is unreachable.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
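A self-contained toy of the shape this review is steering toward: a single immutable `Option` lookup replaces the two `var`s, the swallowed exception, and the unreachable `else` branch.

```scala
case class StageData(stageId: Int, name: String) // stand-in for v1.StageData

val store = Map(1 -> StageData(1, "collect")) // stand-in for the status store

def lookup(stageId: Int): StageData = store(stageId) // throws NoSuchElementException if absent

// Missing stage => None; no nulls, no dead branches.
val stageData: Option[StageData] =
  try Some(lookup(2))
  catch { case _: NoSuchElementException => None }

val header = stageData.fold("No stage data found")(s => s"Details for Stage ${s.stageId}")
```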
[GitHub] spark pull request #21809: [SPARK-24851][UI] Map a Stage ID to it's Associat...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/21809#discussion_r205754677

--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---

@@ -182,6 +198,15 @@ private[ui] class StagePage(parent: StagesTab, store: AppStatusStore) extends We
             {Utils.bytesToString(stageData.diskBytesSpilled)}
           
         }}
+        {if (!stageJobIds.isEmpty) {
+          
+            Associated Job Ids:
+            {for (jobId <- stageJobIds) yield {
+              val detailUrl = "%s/jobs/job/?id=%s".format(

--- End diff --

Using `map` is more readable.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
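A toy version of the point, with an assumed URL shape: the `for`/`yield` in the diff is a plain `map` in disguise, and writing it as one reads more directly.

```scala
val stageJobIds = Seq(3, 4, 7) // illustrative job ids

// for/yield form, as in the diff:
val links1 = for (jobId <- stageJobIds) yield s"/jobs/job/?id=$jobId"

// equivalent map form, as the review suggests:
val links2 = stageJobIds.map(jobId => s"/jobs/job/?id=$jobId")

assert(links1 == links2)
```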
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21584 **[Test build #93662 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93662/testReport)** for PR 21584 at commit [`131f11f`](https://github.com/apache/spark/commit/131f11f8deb96fa7fa4f78522b73e5bbf2b9345e). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21584 **[Test build #93663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93663/testReport)** for PR 21584 at commit [`1f0cba5`](https://github.com/apache/spark/commit/1f0cba59b650e4458e9472933068928d52a54777). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1399/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21584 **[Test build #93662 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93662/testReport)** for PR 21584 at commit [`131f11f`](https://github.com/apache/spark/commit/131f11f8deb96fa7fa4f78522b73e5bbf2b9345e). * This patch **fails to build**. * This patch **does not merge cleanly**. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93662/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21584 **[Test build #93663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93663/testReport)** for PR 21584 at commit [`1f0cba5`](https://github.com/apache/spark/commit/1f0cba59b650e4458e9472933068928d52a54777). * This patch **fails to build**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait BarrierTaskContext extends TaskContext ` * `class BarrierTaskInfo(val address: String)` * `class RDDBarrier[T: ClassTag](rdd: RDD[T]) ` * `case class WorkerOffer(` * `trait AnalysisHelper extends QueryPlan[LogicalPlan] ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93663/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/21584 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21584 **[Test build #93664 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93664/testReport)** for PR 21584 at commit [`1f0cba5`](https://github.com/apache/spark/commit/1f0cba59b650e4458e9472933068928d52a54777). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1400/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21584 **[Test build #93664 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93664/testReport)** for PR 21584 at commit [`1f0cba5`](https://github.com/apache/spark/commit/1f0cba59b650e4458e9472933068928d52a54777). * This patch **fails to build**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait BarrierTaskContext extends TaskContext ` * `class BarrierTaskInfo(val address: String)` * `class RDDBarrier[T: ClassTag](rdd: RDD[T]) ` * `case class WorkerOffer(` * `trait AnalysisHelper extends QueryPlan[LogicalPlan] ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93664/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR on K8s
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21584 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21802 **[Test build #93661 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93661/testReport)** for PR 21802 at commit [`4135690`](https://github.com/apache/spark/commit/4135690f2cf1eea375a1a4f1697c0ffdb7436627). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21802 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21802 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93661/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org