In memory management design I described the deficiencies in IoTivity's memory management. Here I describe the architectural and design changes that can eliminate them.
A wise mobile game architect once told me that his games allocate all memory at startup, never touching the heap thereafter. My experience says this is a worthy goal for IoTivity.
The primary strategies for robust, long-term memory management build on each other. Most of the coding work will involve reducing the number of heap allocations; several tactics allow drastic reductions there. Once the allocations have been reduced to a critical few, it becomes possible to rethink the overall structure.
Many developers' reaction to a constrained memory environment is to write code that minimizes the total size of allocated structures. This leads to issues like the ones described in the previous section. I submit that reliability and repeatability are far more important in a constrained environment than efficiency.
To illustrate this issue, consider a design where a complex structure is needed for each transaction. A developer finds that by allocating the component elements at only the size needed for a specific transaction, he can usually use half the memory required by a worst-case allocation of all the fields. Theoretically, the design can then usually run twice as many transactions simultaneously. In practice, random variation means one or the other allocation will often be larger than average, so in many cases there will still only be enough memory to run one transaction at a time. Worse, any other demands on the memory, especially long-lived allocations, will have an outsized effect.

Furthermore, allocating a second structure while the first is being freed (freeing is unlikely to be an atomic operation) can easily result in the second buffer failing to allocate, usually when it is almost fully built. And while that second structure is being torn down after its failure, a third request can arrive and hit the same problem, and the cycle can repeat. Using most of the memory as a single, fully allocated buffer ensures that at least one transaction can always take place, and it eliminates the allocation/freeing overhead, perhaps allowing faster transaction throughput.
I'm not asking you to believe every detail of the previous paragraph; I wanted to illustrate some of the issues involved. The real situation is much more complex, and the likelihood of unexpected behavior is high. I believe the certainty of static allocation, the time saved by eliminating malloc/free, and the coding simplicity all make static allocation preferable.
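To make the contrast concrete, here is a minimal sketch of the static approach. Nothing here is IoTivity code; the names (transaction_t, MAX_TRANSACTIONS, and the field sizes) are hypothetical stand-ins for whatever a real transaction actually needs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical worst-case-sized transaction, allocated statically,
 * instead of malloc'ing each field per request. */
#define MAX_URI_LENGTH   64
#define MAX_PAYLOAD_SIZE 512
#define MAX_TRANSACTIONS 2

typedef struct
{
    bool in_use;
    char uri[MAX_URI_LENGTH];               /* worst case, always available */
    unsigned char payload[MAX_PAYLOAD_SIZE];
    size_t payload_length;
} transaction_t;

static transaction_t transaction_pool[MAX_TRANSACTIONS];

/* Claim a slot; never touches the heap, so it cannot fail halfway
 * through building a transaction the way incremental mallocs can. */
transaction_t *transaction_acquire(void)
{
    for (size_t i = 0; i < MAX_TRANSACTIONS; i++)
    {
        if (!transaction_pool[i].in_use)
        {
            memset(&transaction_pool[i], 0, sizeof transaction_pool[i]);
            transaction_pool[i].in_use = true;
            return &transaction_pool[i];
        }
    }
    return NULL;   /* all slots busy: a predictable, recoverable condition */
}

void transaction_release(transaction_t *t)
{
    t->in_use = false;
}
```

The acquire path either succeeds immediately or reports "busy"; the partial-allocation failure modes described above simply cannot occur.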
When the number of allocated structures has been reduced to a manageable number, say 5-10, their contents should be carefully analyzed for redundancies and overages. The structures of a constrained environment should be deliberately constructed and reviewed, and their sizes should be easy to adjust with macros or build variables.
Once it is possible to build a server-only IoTivity, the right sizes for some structures may differ from the values baked into a combined client-server build. For example, the resource URI strings of a client must be big enough for any server it might talk to, including ones not yet designed, but the URI strings of a server need only be large enough to reference the server itself. URI strings can be a significant memory allocation, and with careful analysis a server's required size can be known.
The structure-size parameters should be gathered in one place and documented well enough for server application developers to tune them; a sketch follows.
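Here is what such a single tuning point might look like. The macro names and default values are illustrative, not IoTivity's actual build variables:

```c
/* memory_config.h -- hypothetical single home for all structure-size
 * tuning knobs. Each can be overridden from the build command line,
 * e.g. -DOC_MAX_URI_LENGTH=32 for a server-only build. */

#ifndef OC_MAX_URI_LENGTH
/* Client builds must allow for any server's URIs; a server-only build
 * needs only enough for its own resources and can shrink this. */
#define OC_MAX_URI_LENGTH 256
#endif

#ifndef OC_MAX_RESOURCE_COUNT
#define OC_MAX_RESOURCE_COUNT 8
#endif

#ifndef OC_MAX_PAYLOAD_SIZE
#define OC_MAX_PAYLOAD_SIZE 1024
#endif
```

A server-only build could then shrink OC_MAX_URI_LENGTH to the length of its own longest URI, recovering the memory a client build must reserve for unknown servers.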
Memory allocation failure is different from most errors that will occur. Allocation failure is a system-level failure, and it is almost always fatal to the system that sees it. It may be the result of processing a specific request, but it can also simply strike whichever request happens to be the 1000th one processed. It is typically a resource failure rather than a protocol violation, and it is likely to be the last real failure that node sees.
As a special error, memory allocation failure should be treated carefully. When it occurs, it should invoke a carefully designed reporting path that has priority over everything else and can't fail. By “can't fail” I mean that there should not be any memory allocations in the path of reporting it, or, if a buffer is needed, the availability of that buffer should be guaranteed. One way to guarantee a buffer is to pre-allocate it and only use it to report resource allocation failures.
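A minimal sketch of such a guaranteed reporting path, assuming a platform where writing to stderr does not itself allocate; the buffer and function names are hypothetical:

```c
#include <stdio.h>

/* Hypothetical sketch: a buffer reserved at startup and used for nothing
 * except reporting allocation failure, so the report itself can never
 * fail for lack of memory. */
#define OOM_REPORT_SIZE 128
static char oom_report_buffer[OOM_REPORT_SIZE];

void report_allocation_failure(const char *where)
{
    /* snprintf writes into the pre-reserved buffer; no heap use. */
    snprintf(oom_report_buffer, sizeof oom_report_buffer,
             "FATAL: memory allocation failed in %s", where);

    /* Emit through whatever channel the platform guarantees;
     * fputs to stderr stands in for it here. */
    fputs(oom_report_buffer, stderr);
    fputc('\n', stderr);
}
```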
The response to an allocation failure should also make clear that it is an allocation failure (and likely the last sane message coming from that node). As I mentioned previously, IoTivity reports some memory allocation failures with a generic OC_STACK_ERROR, losing critical information.
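The fix is to propagate a distinct result code. The sketch below uses a trimmed stand-in for IoTivity's OCStackResult enum, which defines OC_STACK_NO_MEMORY for exactly this purpose; the surrounding function is hypothetical:

```c
#include <stdlib.h>

/* Trimmed stand-in for IoTivity's OCStackResult; only the values
 * relevant here are shown. */
typedef enum
{
    OC_STACK_OK,
    OC_STACK_NO_MEMORY,   /* allocation failure: specific and loud      */
    OC_STACK_ERROR        /* generic failure: loses the critical detail */
} OCStackResult;

void report_allocation_failure(const char *where);   /* from the sketch above */

/* Hypothetical function showing the pattern: report through the
 * guaranteed path, then return the specific code, not OC_STACK_ERROR. */
OCStackResult build_response(size_t size, void **out)
{
    *out = malloc(size);
    if (*out == NULL)
    {
        report_allocation_failure("build_response");
        return OC_STACK_NO_MEMORY;
    }
    return OC_STACK_OK;
}
```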
I hope I have made the case that fixing IoTivity's memory issues will require significant effort and the result will be a radically transformed IoTivity. It will require a major act of will on the part of the IoTivity development community to make the changes I present.
But there is a path forward. In memory management design III I describe an IoTivity fork that includes most of the changes presented here.