Our Open Source framework includes some optimized asm alternatives
to RTL's move() and fillchar(), named
MoveFast() and FillCharFast().
We just rewrote the x86_64 versions of both from scratch;
the previous code was taken from third-party snippets.
The brand new code is meant to be more efficient and maintainable. In
particular, we switched to SIMD memory access, 128-bit SSE2 or 256-bit AVX (if
available), whereas the previous version used regular 64-bit registers. The
small blocks (i.e. < 32 ...