This is logical on its face, but in practice it runs contrary to how other game systems work and is, in effect, double dipping on simulated time delay, because the player has already spent real time typing the command. I believe any situation where more than 10 seconds pass between a player issuing a command and something happening in-game will result in players avoiding those systems or feeling the game is not functioning properly, and the delays created by this mechanism can easily stretch to a minute or longer.
For comparison, when a character performs a long pose, there are two elements to the 'time' it occupies: the time the player spends mechanically typing the command, and the virtual time assumed to have elapsed over the course of the pose, which can be assumed to run concurrently with other events occurring in the room around the same time.
With robot poses, however, there appear to be three elements involved: the time it takes the player to type the pose, the artificial simulation of the time it takes the character to enter that same command, and finally the virtual time assumed to elapse while the pose occurs. Because the virtual time is zero in real terms, and because poses are effectively the only common long command, the 'roleplaying speed' of a robot (as compared to a character posing) is penalized twice; it ends up incredibly slow and quickly falls out of sync with other characters.
While it makes a lot of sense to have a skill/stat element to robot control, I feel this artificial per-word/per-character entry delay is unfun in practice and shouldn't stretch to a minute or longer, especially because it only really ends up penalizing roleplay, which is the only real avenue for players starting out with these systems.
Instead of a delay per character issued, something like a flat maximum delay of 10 seconds at the lowest possible skill threshold, and zero delay at the highest, would end up being far more usable and fun, and would also avoid creating the impression of poor game performance or non-functional mechanics.
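To make the difference concrete, here is a rough sketch of the two approaches side by side. Everything in it is an assumption on my part: the function names, the 0-100 skill range, the 0.25 seconds-per-character rate, and the straight-line scaling between lowest and highest skill are made up for illustration and are not how the game actually calculates anything. Only the 10 second cap at lowest skill and zero delay at highest come from the suggestion above.

```python
def per_character_delay(text: str, seconds_per_char: float) -> float:
    """Delay that grows with the length of the command (the behaviour being criticised).

    seconds_per_char is a hypothetical rate; the real value is unknown.
    """
    return len(text) * seconds_per_char


def flat_delay(skill: float,
               min_skill: float = 0.0,
               max_skill: float = 100.0,
               max_delay: float = 10.0) -> float:
    """Delay that depends only on skill, never on command length.

    Returns max_delay at min_skill and 0 at max_skill. The linear scaling
    in between and the 0-100 skill range are assumptions for illustration.
    """
    # Clamp so out-of-range skill values cannot produce negative or oversized delays.
    clamped = max(min_skill, min(skill, max_skill))
    fraction = (max_skill - clamped) / (max_skill - min_skill)
    return max_delay * fraction


if __name__ == "__main__":
    long_pose = "x" * 300  # stand-in for a 300-character pose
    print("per-character:", per_character_delay(long_pose, 0.25), "seconds")
    for skill in (0, 50, 100):
        print(f"flat, skill {skill}:", flat_delay(skill), "seconds")
```

With the made-up rate above, a 300-character pose would wait over a minute under the per-character approach, while the flat approach never exceeds 10 seconds regardless of how long the pose is.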