On 11/4/14, 3:40 PM, Matthew Jordan wrote:
On Tue, Nov 4, 2014 at 12:57 PM, BJ Weschke <bwesc...@btwtech.com> wrote:
  Matt -

  This is a pretty neat idea, indeed, but I've got some questions/thoughts on
implementation. :-)   Apologies if all of this has already been
considered/accounted for...

  1) Does the entire file need to be downloaded and in place in the HTTP
Media Cache before you can call ast_openstream on it? This could cause
some problems with larger files that aren't sitting on a fat pipe local
to the Asterisk instance.
It does need to be completely on the local file system, which would be
a problem for extremely large files and/or slow network connections.

The ability to do an 'asynchronous' version of this is not really
present. The filestream code in the core of Asterisk has nothing that
would allow it to partially buffer the file before playing it back
with some expected maximum size. If we went down that road, it'd
almost be a completely separate filestream concept from what we have
today, which is pretty non-trivial.

I don't think I have a good solution for really large files just yet.
There are some ways to do this using cURL (where we get back a chunk
of binary data, buffer it, and immediately start turning it into
frames for a channel) - but that feels like it would need a lot of
work, since we'd essentially be creating a new remote filestream type.
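For what it's worth, the kind of partial buffering being described could be sketched roughly like this (Python, purely illustrative - the frame size, buffer threshold, and helper name are all made up, and none of this maps onto Asterisk's actual filestream internals):

```python
import io

def frames_from_stream(stream, frame_size=320, min_buffer=4):
    """Yield fixed-size 'frames' from a byte stream, starting to drain
    once min_buffer frames are available rather than waiting for EOF."""
    buffered = []
    while True:
        chunk = stream.read(frame_size)
        if not chunk:
            break
        buffered.append(chunk)
        # Once the minimum buffer is filled, start handing frames to the
        # channel while the download continues in the background.
        if len(buffered) >= min_buffer:
            yield buffered.pop(0)
    # Flush whatever remains after the remote stream ends.
    while buffered:
        yield buffered.pop(0)

# Stand-in for a remote download: 10 "frames" of 320 bytes each.
remote = io.BytesIO(b"\x00" * 3200)
frames = list(frames_from_stream(remote))
```

The hard part, of course, isn't the buffering loop itself - it's that nothing in the core filestream API is shaped like this today.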
I know there's going to be a large population of Asterisk users that will want the simplicity of just specifying a URI for playback and expecting "sorcery" to happen. A decent number of them may even be OK with a sub-second awkward silence on the line while the servicing thread synchronously pulls the URI resource into the local HTTP media cache before playback. That's probably an acceptable experience for a decent number of functional use cases. However, I think one somewhat common use case where this wouldn't go so well is a list of URI resources that aren't already in the HTTP media cache: they'd be fetched serially, in-line, at the time playback really should be starting, blocking the channel with silence until each resource lands in the media cache.

eg - Playback(http://myserver.com/monkeys.wav&http://myserver.com/can.wav&http://myserver.com/act.wav&http://myserver.com/like.wav&http://myserver.com/weasels.wav) <--- On an empty HTTP Media cache, the previous app invocation would probably sound pretty bad to the first caller going through this workflow. :-)

Also, I think the inability to use & in a URI for playback really limits the usefulness of this change. I totally understand why the typical URI decode doesn't work, but perhaps a combination of a URI-encoded & with an HTML entity representation is a suitable alternative? eg - treat %26amp; as a literal & in a Playback URI, and do that pattern replacement before any other URI decoding/encoding operations. (Yeah, I know, it's a hack, but not allowing multiple parameters in a loaded query-string URL is way too restricting IMHO.)
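If it helps clarify the proposal, the pattern replacement being suggested could look something like this (Python sketch; split_playback_uris is a hypothetical helper, not anything that exists in Asterisk):

```python
def split_playback_uris(arg):
    """Split a Playback() argument on '&' (the file-list separator),
    then substitute the proposed %26amp; escape back to a literal '&'
    inside each individual URI."""
    return [uri.replace("%26amp;", "&") for uri in arg.split("&")]

uris = split_playback_uris(
    "http://myserver.com/a.wav?x=1%26amp;y=2&http://myserver.com/b.wav")
# The first URI keeps its two-parameter query string intact.
```

The key property is that the list split happens before the escape is resolved, so a literal & inside a query string never gets confused with the file separator.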

  2) What kind of locking is in place on the design to prevent HTTP Media
Cache from trying to update an expired resource that's already in the middle
of being streamed to a channel?
Items in the cache are reference counted, so if something is using an
item in the cache while the cache is being purged, that is safely
handled. The buckets API (which is based on sorcery) assumes a 'if
you're using it, you can hold it safely while something else swaps it
out' model of management - so it is safe to update the entry in the
cache with something new while something else uses the old cached
entry. The 'local file name' associated with the URI would be created
with mkstemp, so the risk of collision with local file names is low.

In the same fashion, a local file that is currently open and being
streamed has a reference associated with it in the OS. Calling unlink
on it will not cause the file to be disposed of until it is released.
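The unlink-while-open behavior is easy to demonstrate on a POSIX system (Python sketch; the file contents are obviously fake, and this relies on POSIX semantics - Windows behaves differently):

```python
import os
import tempfile

# Create a cache entry the way the proposal describes: mkstemp gives a
# collision-free local file name.
fd, path = tempfile.mkstemp(suffix=".wav")
os.write(fd, b"fake audio data")
os.lseek(fd, 0, os.SEEK_SET)

# Simulate a cache purge while the file is still being streamed:
# unlink removes the name, but the open descriptor keeps the data alive.
os.unlink(path)
still_readable = os.read(fd, 1024)
os.close(fd)  # only now is the storage actually released
```

So a purge can safely unlink a cached file out from under an active playback; the reader's descriptor pins the inode until it closes.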

I had to do a little bit of reading up on the Bucket File API, but yes, that definitely resolves the concern I had, and that's pretty cool. :-)
  3) I think you need to also introduce a PUT method on the HTTP Media
Cache, because I can think of a bunch of scenarios where a write
operation on func_curl may fall short for retrieving the file (eg -
trying to pull ACL'd media from an S3 volume where you need custom
HTTP request headers, etc). We shouldn't try to architect/design for
all of these scenarios in Asterisk via a write operation on func_curl;
a PUT to the HTTP Media Cache seems like a reasonable approach to
handle that.

I had thought about this, but didn't have a strong use case for it -
thanks for providing one!

How about something like:

GET /media_cache - retrieve a List of [Sound] in the cache
PUT /media_cache (note: would need to have parameters passed in the body)
     uri=URI to retrieve the media from
     headers=JSON list of key/value pairs to pass with the uri
DELETE /media_cache?uri
     uri=URI to remove from the cache

The Sounds data model would be updated with something like the following:
    "uri": {
        "required": false,
        "description": "If retrieved from a remote source, the originating URI of the sound",
        "type": "string"
    },
    "local_timestamp": {
        "required": false,
        "description": "Creation timestamp of the sound on the local system",
        "type": "datetime"
    },
    "remote_timestamp": {
        "required": false,
        "description": "Creation timestamp of the sound as known by the remote system (if remote)",
        "type": "datetime"
    }


Well, kind of. I think you're still envisioning using cURL behind the scenes, using the input provided in the JSON body of the PUT to /media_cache to go and grab the resource from the remote server. If you go that way, I think not only should we handle custom headers, but it's probably also not unreasonable to provide a way to do basic/digest authentication for the GET call as well.

However, instead of that, I had envisioned being able to do a PUT to /media_cache as a multipart MIME request, where one part is the JSON descriptor and the second part is the binary resource itself that you're looking to place into the HTTP Media Cache. The advantage of doing things this way is that if you're running call control via some sort of API, that API will know for certain when files/resources are ready to be played back, and you don't run the risk of the awkward blocking-silence scenario above. When you do it this way, though, the URI description/parameter itself doesn't make too much sense, because it's not really where the resource came from.

I guess there's also a question as to whether or not we follow true REST practice and use POST for a brand-new resource and PUT for updates to existing resources.
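To make the multipart idea concrete, here's a rough Python sketch of what a client might send - the part names ("descriptor", "media") and the descriptor fields are entirely made up, since none of this is specified yet:

```python
import json
import uuid

def build_media_cache_put(descriptor, media_bytes):
    """Build a multipart/form-data body with one JSON descriptor part
    and one binary media part, as described in the proposal above."""
    boundary = uuid.uuid4().hex
    json_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="descriptor"\r\n'
        "Content-Type: application/json\r\n\r\n"
        f"{json.dumps(descriptor)}\r\n"
    ).encode()
    media_header = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="media"; filename="sound.wav"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    body = json_part + media_header + media_bytes + closing
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, body

ctype, body = build_media_cache_put(
    {"id": "monkeys", "format": "wav"}, b"RIFF....")
```

The point is simply that the API doing call control pushes the bytes itself, so it knows the cache entry exists before it ever queues the playback.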

As for the timestamps for deciding whether the local cache is dirty, I don't think we should try to reinvent the wheel here. We should stick with what's already well established for stuff like this: store the entity tag (ETag) response header, and then use the "If-None-Match" request header approach. Google does a much better job of explaining it than I can here: https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching
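The conditional-request flow boils down to something like this (Python sketch of the decision logic only; the helper names are hypothetical):

```python
def conditional_headers(cached_etag):
    """Only send a validator when we have one stored from a prior 200."""
    return {"If-None-Match": cached_etag} if cached_etag else {}

def refresh_decision(status):
    """Decide what to do with a cache entry after a conditional GET."""
    if status == 304:
        return "use-cached"  # unchanged: reset freshness, play local copy
    if status == 200:
        return "replace"     # new representation: store body + new ETag
    return "error"

headers = conditional_headers('"abc123"')
decisions = [refresh_decision(304), refresh_decision(200)]
```

This sidesteps all the clock-skew problems that comparing local and remote timestamps would invite, since the server's validator is authoritative.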


--
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-dev mailing list