Hello! I'm working on a research project, and I also happen to be relatively new to Hadoop/MapReduce. So apologies ahead of time for any glaring errors.
On my local machine, my project runs within a JVM and uses a Java API to communicate with a Prolog server for information lookups. I was planning to deploy my project as the mapper in the MR job, but I am unclear on how I would access the Prolog server at runtime.

Would it be O.K. to let the server live and run on each data node while my job is running, and have each mapper hit the server on its respective node? (Let's assume the server can handle the high volume of queries from the mappers.) I am not even remotely aware of what types of issues will arise when the mappers (each from its own JVM/process) query the Prolog server (running in its own separate process on each node). They will only be querying data from the server, never deleting or updating it.

Is there anything that would make this impossible, or anything I should be looking out for?

Thanks,
Robert
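Edit: to make the plan concrete, here is a rough sketch of what I have in mind. `PrologClient` and its methods are stand-ins for whatever the real Java API to the Prolog server looks like, and `localhost:4242` is just an assumed address for the server instance running on each data node; the point is only the one-connection-per-mapper-task lifecycle via `setup()`/`cleanup()`:

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PrologLookupMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Hypothetical client for the Prolog server running on this node.
    private PrologClient prolog;

    @Override
    protected void setup(Context context) throws IOException {
        // Open one connection per mapper task, to the server on the
        // same node ("localhost" from the task's point of view).
        prolog = new PrologClient("localhost", 4242);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Read-only lookup against the node-local Prolog server.
        String result = prolog.query(value.toString());
        context.write(value, new Text(result));
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        // Release the connection when the task finishes.
        prolog.close();
    }
}
```

The idea being that each mapper JVM holds its own connection, so the server sees one read-only client per concurrently running task on that node.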