[
https://issues.apache.org/jira/browse/TIKA-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18045911#comment-18045911
]
ASF GitHub Bot commented on TIKA-4579:
--------------------------------------
nddipiazza opened a new pull request, #2465:
URL: https://github.com/apache/tika/pull/2465
## JIRA Ticket
https://issues.apache.org/jira/browse/TIKA-4579
## Summary
Adds the ability to update existing fetcher/emitter configurations at
runtime without using reflection hacks. The TikaGrpcServer previously had to
use reflection to forcibly clear the cache when updating fetchers because
saveComponent() would throw an exception for duplicate IDs.
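
For context, a reflection-based cache clear of the kind described above might look roughly like the sketch below. `LegacyManager`, `ReflectionCacheClearer`, and the private field name `componentCache` as used here are illustrative assumptions, not the actual Tika code that this PR removes.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a manager that keeps its instance cache private.
class LegacyManager {
    private final Map<String, Object> componentCache = new HashMap<>();

    // Instantiates lazily and caches the instance by ID.
    synchronized Object getComponent(String id) {
        return componentCache.computeIfAbsent(id, k -> new Object());
    }
}

// Hypothetical helper showing the workaround: clear a private map by field name.
class ReflectionCacheClearer {
    static void clearCache(Object manager, String fieldName) {
        try {
            Field field = manager.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // bypasses the private modifier
            ((Map<?, ?>) field.get(manager)).clear();
        } catch (ReflectiveOperationException e) {
            // Only surfaces at runtime, e.g. if the field is renamed.
            throw new IllegalStateException("could not clear cache", e);
        }
    }
}
```

The fragility is visible in the sketch: the caller depends on a private field name that the compiler cannot check, which is the kind of coupling a supported update path avoids.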
## Changes
- **AbstractComponentManager.saveComponent()**: Changed behavior to support
  updates instead of throwing an exception when the component ID already exists
  - Removed the duplicate ID check that threw TikaConfigException
  - When updating an existing component, the cache is cleared to force
    re-instantiation
  - Added logging to distinguish between creating new and updating existing
    configs
- **TikaGrpcServerImpl.saveFetcher()**: Removed reflection hack
  - Deleted the reflection-based code that was forcibly clearing the cache
  - Now simply calls fetcherManager.saveFetcher(), which handles updates
    properly
- **Updated JavaDocs**: Modified documentation for FetcherManager,
  EmitterManager, and AbstractComponentManager
  - Changed from "adds a component" to "adds or updates a component"
  - Removed mentions of exceptions for duplicate IDs
- **Updated Tests**: Modified FetcherManagerTest
  - Changed the test from expecting TikaConfigException to verifying update
    behavior
  - Verifies that updating a fetcher clears the cache and creates a new
    instance
  - Ensures the config store contains only one fetcher after update
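
The save-or-update behavior listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not Tika's `AbstractComponentManager`: the class `SketchComponentManager`, the use of a plain `String` as the config type, the boolean return value, and the factory function are all hypothetical. The map names `configStore` and `componentCache` and the `allowRuntimeModifications` flag follow the wording of the ticket description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only; names and types are assumptions, not Tika's API.
class SketchComponentManager<T> {
    private final Map<String, String> configStore = new HashMap<>();
    private final Map<String, T> componentCache = new HashMap<>();
    private final Function<String, T> factory; // builds a component from its config
    private final boolean allowRuntimeModifications;

    SketchComponentManager(Function<String, T> factory, boolean allowRuntimeModifications) {
        this.factory = factory;
        this.allowRuntimeModifications = allowRuntimeModifications;
    }

    // Adds a new config or replaces an existing one.
    // Returns true if an existing ID was updated, false if it was created.
    synchronized boolean saveComponent(String id, String config) {
        if (!allowRuntimeModifications) {
            throw new IllegalStateException("runtime modifications are disabled");
        }
        boolean updated = configStore.put(id, config) != null;
        if (updated) {
            componentCache.remove(id); // drop the stale instance
        }
        return updated;
    }

    // Instantiates lazily from the stored config on first use after a save.
    synchronized T getComponent(String id) {
        String config = configStore.get(id);
        if (config == null) {
            throw new IllegalArgumentException("unknown component: " + id);
        }
        return componentCache.computeIfAbsent(id, k -> factory.apply(config));
    }

    synchronized int size() {
        return configStore.size();
    }
}
```

Saving the same ID twice in this sketch leaves one entry in the config store, and the next `getComponent()` call builds a fresh instance from the newer config, which mirrors what the updated FetcherManagerTest is described as verifying.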
## Use Case
1. tika-grpc server starts with no fetcher configs in tika-config (blank
slate)
2. Users call saveFetcher gRPC method to create new fetcher configurations
3. Users can then use those fetchers
4. Users can update/modify existing fetcher configurations without reflection
## Testing
- Modified existing test to verify update behavior
- All existing tests pass
- Verified cache is properly cleared on update
## Security Note
This functionality stores configurations in-memory only. Since tika-grpc is
secured via mutual TLS, only authorized users can modify configurations at
runtime.
> Add the ability to save pipes configs
> -------------------------------------
>
> Key: TIKA-4579
> URL: https://issues.apache.org/jira/browse/TIKA-4579
> Project: Tika
> Issue Type: Sub-task
> Reporter: Nicholas DiPiazza
> Priority: Major
>
> The fetcher and emitter managers need the ability to save/update
> configurations at runtime.
> h2. Background
> The TikaGrpcServer previously used reflection hacks to update fetcher
> configurations because the FetcherManager.saveFetcher() method threw an
> exception when trying to save a fetcher with an ID that already exists.
> h2. Use Case
> A practical scenario for this functionality:
> # tika-grpc server starts with no fetcher configs in the tika-config (blank
> slate)
> # Users call the saveFetcher gRPC method to create new fetcher configurations
> # Users can then use those fetchers
> # Users may need to update/modify existing fetcher configurations
> h2. Solution Implemented
> Modified the AbstractComponentManager.saveComponent() method to support both
> creating new and updating existing component configurations.
> h3. Changes Made:
> *AbstractComponentManager.saveComponent()* - Changed behavior from throwing
> an exception on duplicate IDs to supporting updates:
> * Removed the duplicate ID check that threw TikaConfigException
> * When updating an existing component, the cached instance is cleared to
> force re-instantiation
> * Added logging to distinguish between creating new configs vs updating
> existing ones
> *TikaGrpcServerImpl.saveFetcher()* - Removed reflection hack:
> * Deleted the reflection-based code that was forcibly clearing the cache
> * Now simply calls fetcherManager.saveFetcher() which handles updates properly
> *Updated JavaDocs* - Modified documentation for
> AbstractComponentManager.saveComponent(), FetcherManager.saveFetcher(), and
> EmitterManager.saveEmitter():
> * Changed from "adds a component" to "adds or updates a component"
> * Removed mentions of exceptions for duplicate IDs
> *Updated Tests* - Modified FetcherManagerTest:
> * Changed test from expecting TikaConfigException to verifying update behavior
> * Verifies that updating a fetcher clears the cache and creates a new instance
> * Ensures the config store contains only one fetcher after update
> h2. Security Note
> This "save" functionality stores configurations in-memory only. Since
> tika-grpc is secured via mutual TLS, only authorized users can modify
> configurations at runtime.
> h2. Technical Details
> * Component configurations are stored in a Map (configStore)
> * Component instances are cached in a separate Map (componentCache)
> * When updating an existing config, only the cache is cleared, not the config
> store entry
> * The new configuration will be instantiated lazily on next use via
> getComponent()
> * Runtime modifications require allowRuntimeModifications=true when loading
> the manager