[ https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832336#comment-17832336 ]
ASF GitHub Bot commented on TIKA-4181:
--------------------------------------

bartek commented on code in PR #1702:
URL: https://github.com/apache/tika/pull/1702#discussion_r1544981545

##########
tika-pipes/tika-grpc/src/main/proto/tika.proto:
##########

Review Comment:
For your consideration @nddipiazza, I ran `buf lint` on this protobuf (as I am syncing it to a local repository for development purposes) and here's the report:

```
services/tika/pbtika/tika.proto:29:9:Service name "Tika" should be suffixed with "Service".
services/tika/pbtika/tika.proto:35:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
services/tika/pbtika/tika.proto:35:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
services/tika/pbtika/tika.proto:36:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
services/tika/pbtika/tika.proto:36:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
services/tika/pbtika/tika.proto:36:40:RPC request type "FetchAndParseRequest" should be named "FetchAndParseServerSideStreamingRequest" or "TikaFetchAndParseServerSideStreamingRequest".
services/tika/pbtika/tika.proto:37:3:"tika.FetchAndParseReply" is used as the request or response type for multiple RPCs.
services/tika/pbtika/tika.proto:37:3:"tika.FetchAndParseRequest" is used as the request or response type for multiple RPCs.
services/tika/pbtika/tika.proto:37:50:RPC request type "FetchAndParseRequest" should be named "FetchAndParseBiDirectionalStreamingRequest" or "TikaFetchAndParseBiDirectionalStreamingRequest".
services/tika/pbtika/tika.proto:42:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
services/tika/pbtika/tika.proto:52:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
services/tika/pbtika/tika.proto:61:10:Field name "fetcherName" should be lower_snake_case, such as "fetcher_name".
services/tika/pbtika/tika.proto:62:10:Field name "fetchKey" should be lower_snake_case, such as "fetch_key".
services/tika/pbtika/tika.proto:67:10:Field name "fetchKey" should be lower_snake_case, such as "fetch_key".
services/tika/pbtika/tika.proto:85:10:Field name "fetcherClass" should be lower_snake_case, such as "fetcher_class".
services/tika/pbtika/tika.proto:90:9:Field name "pageNumber" should be lower_snake_case, such as "page_number".
services/tika/pbtika/tika.proto:91:9:Field name "numFetchersPerPage" should be lower_snake_case, such as "num_fetchers_per_page".
services/tika/pbtika/tika.proto:95:28:Field name "getFetcherReply" should be lower_snake_case, such as "get_fetcher_reply".
```

The [buf linter is pretty aggressive](https://buf.build/docs/lint/rules), but I appreciate it for that.
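To make the field-name findings concrete, here is a minimal Python sketch that maps the six flagged field names to the lower_snake_case spellings the report suggests. It is only an illustration: `to_lower_snake_case` is a hypothetical helper, not buf's own implementation, and it assumes plain camelCase input without acronym runs.

```python
import re


def to_lower_snake_case(name: str) -> str:
    """Convert a camelCase identifier to lower_snake_case.

    Hypothetical helper for illustration; not how buf itself checks names.
    """
    # Put an underscore at every boundary where a lowercase letter or digit
    # is followed by an uppercase letter, then lowercase the whole string.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


# Field names flagged in the lint report above.
flagged_fields = [
    "fetcherClass",
    "fetcherName",
    "fetchKey",
    "pageNumber",
    "numFetchersPerPage",
    "getFetcherReply",
]

# Old name -> suggested replacement.
renames = {name: to_lower_snake_case(name) for name in flagged_fields}

for old, new in renames.items():
    print(f"{old} -> {new}")
```

Protobuf's binary encoding identifies fields by number, not by name, so a rename like this changes the generated accessors but not the wire format.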
Here are the rules I've set:

```
lint:
  use:
    - DEFAULT
  except:
    - PACKAGE_VERSION_SUFFIX
    - RPC_RESPONSE_STANDARD_NAME
    - PACKAGE_DIRECTORY_MATCH
  rpc_allow_google_protobuf_empty_responses: true
```

> Grpc + Tika Pipes - pipe iterator and emitter
> ---------------------------------------------
>
>                 Key: TIKA-4181
>                 URL: https://issues.apache.org/jira/browse/TIKA-4181
>             Project: Tika
>          Issue Type: New Feature
>          Components: tika-pipes
>            Reporter: Nicholas DiPiazza
>            Priority: Major
>         Attachments: image-2024-02-06-07-54-50-116.png
>
>
> Add full tika-pipes support of grpc
> * pipe iterator
> * fetcher
> * emitter
> Requires we create a service contract that specifies the inputs we require
> from each method.
> Then we will need to implement the different components with a grpc client
> generated using the contract.
> This would enable developers to run tika-pipes as a persistently running
> daemon instead of just a single batch app, because it can continue to stream
> out more inputs.
> !image-2024-02-06-07-54-50-116.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)