Anastasios, it looks like you already identified the two lines that
need to change: the string interpolation that depends on
UUID.randomUUID and metadataPath.hashCode.

I'd factor that out into a function that returns the group id.  That
function would also need to take the "parameters" variable (the map of
user-provided options) and look for a prefix for the group id,
defaulting to the current behavior.
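
A rough sketch of such a function might look like the following. The
option key "groupIdPrefix" and the names GroupIdSketch / uniqueGroupId
are placeholders assumed for illustration, not existing Spark API; the
real change would live in KafkaSourceProvider and use whatever key is
agreed on in the JIRA.

```scala
import java.util.UUID

object GroupIdSketch {
  // Assumed option key; the thread does not settle on a name.
  val GroupIdPrefixKey = "groupIdPrefix"

  // Prefix used by the current (v2.4.0) behavior.
  val DefaultPrefix = "spark-kafka-source"

  /** Builds a unique group id, honoring a user-provided prefix if present. */
  def uniqueGroupId(parameters: Map[String, String], metadataPath: String): String = {
    // Match the key case-insensitively, since user options may arrive
    // with different casing.
    val prefix = parameters
      .collectFirst { case (k, v) if k.equalsIgnoreCase(GroupIdPrefixKey) => v }
      .getOrElse(DefaultPrefix)

    // Same shape as the existing interpolation, with only the prefix varying.
    s"$prefix-${UUID.randomUUID}-${metadataPath.hashCode}"
  }
}
```

With no option set, this keeps today's "spark-kafka-source-..." ids;
with Map("groupIdPrefix" -> "team-a") the id would start with "team-a-".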

If you have questions, feel free to ping me on the JIRA, or get as far
as you can and submit a PR for more discussion.

On Mon, Nov 19, 2018 at 2:38 PM Anastasios Zouzias <zouz...@gmail.com> wrote:
>
> Hi Tom,
>
> I initiated an issue here: https://issues.apache.org/jira/browse/SPARK-26121
>
> Feel free to edit/update the ticket. If someone familiar with the codebase 
> has any suggestion on the proper way of fixing this, I could work on it.
>
> Best,
> Anastasios
>
> On Mon, Nov 19, 2018 at 4:31 PM Tom Graves <tgraves...@yahoo.com> wrote:
>>
>> This makes sense to me; I was going to propose something similar in order 
>> to be able to use the Kafka ACLs more effectively as well. Can you file a 
>> JIRA for it?
>>
>> Tom
>>
>> On Friday, November 9, 2018, 2:26:12 AM CST, Anastasios Zouzias 
>> <zouz...@gmail.com> wrote:
>>
>>
>> Hi all,
>>
>> I ran into the following situation with Spark Structured Streaming (SS) using 
>> Kafka.
>>
>> In a project that I work on, there is already a secured Kafka setup where 
>> ops can issue an SSL certificate per "group.id", which should be predefined 
>> (or, hopefully, only its prefix needs to be predefined).
>>
>> On the other hand, Spark SS fixes the group.id to
>>
>> val uniqueGroupId = 
>> s"spark-kafka-source-${UUID.randomUUID}-${metadataPath.hashCode}"
>>
>> see, e.g.,
>>
>> https://github.com/apache/spark/blob/v2.4.0/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L124
>>
>> I guess Spark developers had a good reason to fix it, but is it possible to 
>> make the prefix of the above uniqueGroupId ("spark-kafka-source") 
>> configurable? If so, I could prepare a PR for it.
>>
>> The rationale is that we do not want all Spark jobs to use the same 
>> certificate on group ids of the form (spark-kafka-source-*).
>>
>>
>> Best regards,
>> Anastasios Zouzias
>
>
>
> --
> -- Anastasios Zouzias
